Service:
package org.wso2.carbon.service;

import org.apache.axis2.context.MessageContext;
import org.apache.axis2.context.ServiceContext;

public class spsessionservctxservice {

    // Computes the product and stores it in the ServiceContext so that the
    // cluster can replicate it to the other nodes.
    public int multiply(int x, int y) {
        System.out.println("Setting context!!!!!");
        System.out.println(x * y);
        ServiceContext serviceContext =
                MessageContext.getCurrentMessageContext().getServiceContext();
        serviceContext.setProperty("VALUE", Integer.valueOf(x * y));
        return 0;
    }

    // Reads the replicated value back from the ServiceContext, possibly on another node.
    public int getResult() {
        System.out.println("Getting results!!!!!");
        ServiceContext serviceContext =
                MessageContext.getCurrentMessageContext().getServiceContext();
        return ((Integer) serviceContext.getProperty("VALUE")).intValue();
    }
}
You can download the service archive from:
https://svn.wso2.org/repos/wso2/trunk/commons/qa/qa-artifacts/app-server/session-management/spsessionservctxservice.aar
Deploy it on two instances in the same cluster.
Client:
package org.wso2.carbon.service;

import java.rmi.RemoteException;
import java.util.Random;

import org.apache.axis2.AxisFault;
import org.apache.axis2.addressing.EndpointReference;
import org.apache.axis2.client.Options;
import org.apache.axis2.client.ServiceClient;
import org.apache.axis2.context.ConfigurationContext;
import org.apache.axis2.context.ConfigurationContextFactory;

public class Client {

    public static void main(String[] args) throws AxisFault {
        Random random = new Random();
        for (int x = 0; x < 10; x++) {
            int j = random.nextInt(1000);
            int k = random.nextInt(2000);

            ConfigurationContext configContext =
                    ConfigurationContextFactory.createConfigurationContextFromFileSystem(
                            "/home/chamara/Carbon4/wso2as-5.0.0-SNAPSHOT_node1/repository", null);

            // Stub pointing at node 1; session management is handled by the underlying ServiceClient.
            SpsessionservctxserviceStub stb1 = new SpsessionservctxserviceStub(configContext,
                    "http://10.200.2.75:9763/node1/services/spsessionservctxservice/");
            ServiceClient sc = stb1._getServiceClient();
            sc.engageModule("addressing");
            Options opts = sc.getOptions();
            opts.setManageSession(true);
            sc.setOptions(opts);

            try {
                // Store j * k in the service context on node 1.
                stb1.multiply(j, k);
            } catch (RemoteException e1) {
                e1.printStackTrace();
            }

            try {
                // Give the cluster a moment to replicate the state.
                Thread.sleep(1);
            } catch (InterruptedException e1) {
                e1.printStackTrace();
            }

            // Reuse the same ServiceClient (and therefore the same session) against node 2.
            SpsessionservctxserviceStub stb2 = new SpsessionservctxserviceStub(configContext,
                    "http://10.200.2.75:9764/node2/services/spsessionservctxservice/");
            stb2._setServiceClient(sc);
            stb2._getServiceClient().getOptions().setTo(new EndpointReference(
                    "http://10.200.2.75:9764/node2/services/spsessionservctxservice/"));

            try {
                // Read the replicated value back from node 2.
                System.out.println("Result is: " + stb2.getResult());
            } catch (RemoteException e) {
                e.printStackTrace();
            }
        }
    }
}
In the client, point the multiply call at one server's service endpoint and the getResult call at the other server's endpoint. The multiply method puts the state into the service context, the cluster keeps it and replicates it among the nodes, and getResult then retrieves that state from any other node in the cluster.
Friday, May 4, 2012
OpenLDAP Clustering Guide
This is a complete guide to installing OpenLDAP and clustering it in mirror mode.
First you need to install BerkeleyDB as the data store for OpenLDAP. You can download BerkeleyDB from
http://www.oracle.com/technetwork/products/berkeleydb/downloads/index.html
Also make sure that g++ and the other essential build dependencies are installed on your machine. If you are using Ubuntu,
$ sudo apt-get install build-essential
will install all of them.
Here I have used BDB version 4.8.30.
1. Create a directory for BDB installation
mkdir /home/chamara/OpenLDAP/BerkeleyDB
2. Extract the BerkeleyDB distribution
tar -xvf db-4.8.30.tar.gz
3. Go to the directory
/db-4.8.30/build_unix
4. Run the following command
/build_unix$ ../dist/configure --prefix=/home/chamara/OpenLDAP/BerkeleyDB
--prefix sets the final BDB installation path. There are many other parameters that can be set at installation time; for a complete reference, please refer to the BDB documentation.
5. At the end of configure you will see output like the following:
configure: creating ./config.status
config.status: creating Makefile
config.status: creating db_cxx.h
config.status: creating db_int.h
config.status: creating clib_port.h
config.status: creating include.tcl
config.status: creating db.h
config.status: creating db_config.h
config.status: executing libtool commands
You will also see that a Makefile has been created in the current directory:
-rw-r--r-- 1 chamara chamara 81K 2012-05-05 05:37 Makefile
Now continue the installation with:
$ make
$ make install
Now if you check the BerkeleyDB directory
/BerkeleyDB$ ls -lah
total 24K
drwxr-xr-x 6 chamara chamara 4.0K 2012-05-05 05:51 .
drwxr-xr-x 5 chamara chamara 4.0K 2012-05-05 05:36 ..
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 05:51 bin
drwxr-xr-x 13 chamara chamara 4.0K 2012-05-05 05:51 docs
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 05:51 include
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 05:51 lib
6. Now you have to set the following environment variables so that OpenLDAP can find where BerkeleyDB is installed:
$CPPFLAGS="-I/home/chamara/OpenLDAP/BerkeleyDB/include"
$export CPPFLAGS
$LDFLAGS="-L/usr/local/lib -L/home/chamara/OpenLDAP/BerkeleyDB/lib -R/home/chamara/OpenLDAP/BerkeleyDB/lib"
$export LDFLAGS
$LD_LIBRARY_PATH="/home/chamara/OpenLDAP/BerkeleyDB/lib"
$export LD_LIBRARY_PATH
* Now BerkeleyDB is installed properly
7. Now you need an OpenLDAP distribution. You can download it from
http://www.openldap.org/software/download/
I have used openldap-stable-20100719.tgz
Extract the distribution
$ tar -xvf openldap-stable-20100719.tgz
Now go to the OpenLDAP distribution
$ cd openldap-2.4.23/
$ ls -alh
total 1.5M
drwxr-xr-x 10 chamara chamara 4.0K 2010-06-30 05:23 .
drwxr-xr-x 5 chamara chamara 4.0K 2012-05-05 05:36 ..
-rw-r--r-- 1 chamara chamara 244K 2005-10-30 03:37 aclocal.m4
-rw-r--r-- 1 chamara chamara 3.8K 2010-04-14 01:52 ANNOUNCEMENT
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 05:36 build
-rw-r--r-- 1 chamara chamara 42K 2010-06-29 20:53 CHANGES
drwxr-xr-x 3 chamara chamara 4.0K 2012-05-05 05:36 clients
-rwxr-xr-x 1 chamara chamara 1.1M 2010-04-20 00:52 configure
-rw-r--r-- 1 chamara chamara 92K 2010-04-19 22:23 configure.in
drwxr-xr-x 7 chamara chamara 4.0K 2012-05-05 05:36 contrib
-rw-r--r-- 1 chamara chamara 2.3K 2010-04-14 01:52 COPYRIGHT
drwxr-xr-x 8 chamara chamara 4.0K 2012-05-05 05:36 doc
drwxr-xr-x 3 chamara chamara 4.0K 2012-05-05 05:36 include
-rw-r--r-- 1 chamara chamara 4.4K 2010-04-14 01:52 INSTALL
drwxr-xr-x 8 chamara chamara 4.0K 2012-05-05 05:36 libraries
-rw-r--r-- 1 chamara chamara 2.2K 2003-11-25 00:42 LICENSE
-rw-r--r-- 1 chamara chamara 1.1K 2010-04-14 01:52 Makefile.in
-rw-r--r-- 1 chamara chamara 3.5K 2010-04-14 01:52 README
drwxr-xr-x 3 chamara chamara 4.0K 2012-05-05 05:36 servers
drwxr-xr-x 5 chamara chamara 4.0K 2010-06-30 05:23 tests
8. Run the following command
$ ./configure --prefix=/home/chamara/OpenLDAP/OpenLDAP
Again, as in the BDB installation, --prefix sets the final OpenLDAP installation path.
9. Now you will see that a Makefile has been created:
-rw-r--r-- 1 chamara chamara 9.3K 2012-05-05 06:07 Makefile
Run the following commands in this order to build and install OpenLDAP:
$ make depend
$ make
$ make test
$ make install
10. Now we are done with the OpenLDAP installation. If you check the destination directory
/OpenLDAP$ ls -lah
total 40K
drwxr-xr-x 10 chamara chamara 4.0K 2012-05-05 06:35 .
drwxr-xr-x 6 chamara chamara 4.0K 2012-05-05 06:06 ..
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 06:35 bin
drwxr-xr-x 3 chamara chamara 4.0K 2012-05-05 06:35 etc
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 06:35 include
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 06:35 lib
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 06:35 libexec
drwxr-xr-x 2 chamara chamara 4.0K 2012-05-05 06:35 sbin
drwxr-xr-x 3 chamara chamara 4.0K 2012-05-05 06:35 share
drwxr-xr-x 4 chamara chamara 4.0K 2012-05-05 06:35 var
Now we have to configure the OpenLDAP installation. I will refer to this directory as $OpenLDAP_HOME.
11. Change to the $OpenLDAP_HOME directory.
12. Append the following to etc/openldap/ldap.conf:
BASE dc=test,dc=com
URI ldap://172.16.246.1:1389
You have to use a valid IP address here.
13. Create the DB_CONFIG file:
$ cp etc/openldap/DB_CONFIG.example etc/openldap/DB_CONFIG
14. Now configure etc/openldap/slapd.conf.
Append the following:
include /home/chamara/OpenLDAP/OpenLDAP/etc/openldap/schema/cosine.schema
include /home/chamara/OpenLDAP/OpenLDAP/etc/openldap/schema/nis.schema
include /home/chamara/OpenLDAP/OpenLDAP/etc/openldap/schema/inetorgperson.schema
after the existing line:
include /home/chamara/OpenLDAP/OpenLDAP/etc/openldap/schema/core.schema
Then find the 'BDB database definitions' section and change the suffix, rootdn and rootpw entries to:
suffix "dc=test,dc=com"
rootdn "cn=admin,dc=test,dc=com"
rootpw admin123
Add the following for mirror mode replication:
index objectClass eq
index entryCSN,entryUUID eq
syncrepl rid=002
provider=ldap://{$ip-address of the other OpenLDAP instance$}:1389/
type=refreshAndPersist
retry="60 30 300 +"
searchbase="dc=test,dc=com"
bindmethod=simple
binddn="cn=admin,dc=test,dc=com"
credentials=admin123
mirrormode TRUE
overlay syncprov
syncprov-checkpoint 100 10
syncprov-reloadhint true
syncprov-nopresent true
syncprov-sessionlog 100
On my machine, {$ip-address of the other OpenLDAP instance$} is
172.16.246.128
15. Now the configuration of OpenLDAP node 1 is done. Follow the same procedure for node 2; you only have to change the IP addresses in ldap.conf and slapd.conf.
16. Start OpenLDAP using one of the following commands:
$ ./libexec/slapd -h ldap://172.16.246.1:1389
or
$ ./libexec/slapd -h ldap://172.16.246.1:1389 -d3 (debug mode)
17. Create the following files to add the default entries to the OpenLDAP store.
$ vi build_root_ou.ldif
INSERT;
dn: dc=test,dc=com
objectClass: dcObject
objectClass: organizationalUnit
dc: test
ou: testou
$ vi add_user_ou.ldif
INSERT;
dn: ou=user,dc=test,dc=com
objectClass: organizationalUnit
ou: user
$ vi add_groups_ou.ldif
INSERT;
dn: ou=Groups,dc=test,dc=com
objectClass: organizationalUnit
ou: Groups
$ vi add_user_uid.ldif
INSERT;
dn: uid=admin,ou=user,dc=test,dc=com
cn: Admin
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top
cn: WSO2
sn: Open Source Middleware
uid: admin
userPassword: {SSHA}A1toNdJpoocuptlnEYkKWZa45oxag4GG
Use
$OpenLDAP_HOME/sbin$ ./slappasswd
to hash the password and get the SSHA value:
/sbin$ ./slappasswd
New password:
Re-enter new password:
{SSHA}A1toNdJpoocuptlnEYkKWZa45oxag4GG
I used 'admin123' as the password.
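For reference, the {SSHA} value above is just a base64 encoding of SHA-1(password + salt) followed by the salt. Below is a rough Java sketch of that scheme, shown only to illustrate the format; slappasswd remains the recommended way to generate the value.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class Ssha {
    // Builds "{SSHA}" + base64( SHA-1(password || salt) || salt ), the format slappasswd emits.
    public static String hash(String password) throws Exception {
        byte[] salt = new byte[4];                 // slappasswd uses a short random salt
        new SecureRandom().nextBytes(salt);
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(password.getBytes(StandardCharsets.UTF_8));
        sha1.update(salt);
        byte[] digest = sha1.digest();
        byte[] digestAndSalt = new byte[digest.length + salt.length];
        System.arraycopy(digest, 0, digestAndSalt, 0, digest.length);
        System.arraycopy(salt, 0, digestAndSalt, digest.length, salt.length);
        return "{SSHA}" + Base64.getEncoder().encodeToString(digestAndSalt);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hash("admin123"));      // prints a value like the one above
    }
}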
18. Now use the following command to add the .ldif files to the LDAP store.
(I am now in the $OpenLDAP_HOME/etc/openldap directory, where all the .ldif files are saved.)
$ ../../bin/ldapadd -D "cn=admin,dc=test,dc=com" -W -x -f build_root_ou.ldif
Enter LDAP Password:
adding new entry "dc=test,dc=com"
Then add the remaining files the same way:
$ ../../bin/ldapadd -D "cn=admin,dc=test,dc=com" -W -x -f add_user_ou.ldif
$ ../../bin/ldapadd -D "cn=admin,dc=test,dc=com" -W -x -f add_groups_ou.ldif
$ ../../bin/ldapadd -D "cn=admin,dc=test,dc=com" -W -x -f add_user_uid.ldif
19. The next step is to load balance the two OpenLDAP nodes, using either a hardware load balancer or a comparable software load balancer.
If you use Apache Directory Studio and connect to one of the nodes' LDAP stores, you will see the LDAP tree we created. In the connection wizard, fill in the network parameters (host and port 1389) and the authentication details (the admin bind DN and password used above), then click the Finish button and you will be connected to the LDAP store. You can browse and add users from there.
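If you want to verify the replication from code instead of Apache Directory Studio, the following small JNDI client (plain Java SE, no extra libraries, using the host, port, and admin credentials from this guide) binds to one node and lists the entries under dc=test,dc=com. Pointing it at the other node should show the same tree.

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class LdapCheck {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://172.16.246.128:1389");   // point at either node
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "cn=admin,dc=test,dc=com");
        env.put(Context.SECURITY_CREDENTIALS, "admin123");
        DirContext ctx = new InitialDirContext(env);

        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        NamingEnumeration<SearchResult> results =
                ctx.search("dc=test,dc=com", "(objectClass=*)", controls);
        while (results.hasMore()) {
            // prints the DN of every entry replicated to this node
            System.out.println(results.next().getNameInNamespace());
        }
        ctx.close();
    }
}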
Wednesday, September 8, 2010
SEEDMiner
SeedMiner – the Scalable Data Mining Framework
Data mining can be defined as an attempt to semi-automatically discover previously unknown, useful patterns in large data sets. With the emergence of databases capable of handling terabytes of data, data mining has grown into a separate field that is widely used in business, science, engineering, and marketplace surveys to make predictions.
Though many data mining applications are available nowadays, they are either targeted at a very specific data set or application domain (and scale only there), or generalized to multiple application domains without scalability.
Our project aims at implementing a data mining framework that enables practitioners to build data mining solutions easily and scale them up according to their requirements, while preserving the efficiency and performance of their applications.
Process View of SeedMiner
The process architecture takes into account non-functional requirements such as performance and availability. It addresses concurrency and distribution, system integrity, and fault tolerance, and describes how the main abstractions from the logical view map onto processes and threads, i.e. on which thread of control an operation of an object is actually executed.
Regarding the process view of the system, I can introduce three main processes in our framework. These are introduced as layers in the design.
The first is the Data Feeding Layer (DFL). This is an interface that provides a rich set of methods for feeding data from an external source into the internal storage. The layer holds the data feeders provided by the framework, and new data feeders can be added to it. The job of a data feeder is to read data from an external source, tailor it accordingly, and feed it into the storage. Different data feeders can be attached to the DFL to support a multitude of data sources.
The data feeding process has a large impact on the framework's performance. The motivation for having a data feeding layer is as follows. The raw data inside a database is organized horizontally: every table stores its values row by row, so if we need a particular data item from one column, we have to fetch the whole row and extract the element we need. This is a serious drawback for data mining in particular, because we end up reading entire rows every time we need to extract some data, which degrades performance considerably. What we suggest is to extract the data from the database first and arrange it in a vertical (columnar) structure, which holds the data, quite literally, column by column. If we then need a data item, we can access it directly without wading through values we do not care about, which improves the performance of the framework significantly.
When data feeding is considered per column, the process is independent for each column, so we can achieve some concurrency here. Having multiple threads inside the data feeding layer will increase the performance of the system.
As mentioned above, the data feeding layer is implemented so that different kinds of data feeders can be attached to the DFL, giving it the ability to connect to a multitude of data sources. This raises the scalability of the framework.
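To make the vertical layout concrete, here is a minimal Java sketch of such a columnar store. The class and method names are invented purely for illustration and are not part of SeedMiner's actual API: each distinct value of a column is kept as a bit slice, where bit i is set when row i holds that value.

import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class ColumnStore {
    // one bit slice per distinct value of the column: bit i is set if row i holds that value
    private final Map<String, BitSet> slices = new HashMap<String, BitSet>();

    // called by a data feeder for every (row, value) pair it reads from the external source
    public void feed(int rowIndex, String value) {
        BitSet slice = slices.get(value);
        if (slice == null) {
            slice = new BitSet();
            slices.put(value, slice);
        }
        slice.set(rowIndex);
    }

    // returns the bit slice for a value, or an empty slice if the value never occurred
    public BitSet sliceFor(String value) {
        BitSet slice = slices.get(value);
        return slice == null ? new BitSet() : slice;
    }
}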
The next very important part of our framework is the algorithm layer, where the actual mining happens. In this process the vertically structured data is exposed to different operations: the bit-sliced data (dynamic bit-set values) is fed into the algorithmic processes, and operations such as AND, OR, and NOT are applied as appropriate.
This is a process where concurrency applies at a higher level. Each bit slice is combined with other slices, and the work is independent for each slice, so we can use several threads, partition the bit slices, and run the operations in parallel. This enhances the performance of the framework.
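For example, counting the rows that satisfy two conditions reduces to ANDing two slices. The snippet below reuses the hypothetical ColumnStore sketched earlier:

import java.util.BitSet;

public class SliceDemo {
    public static void main(String[] args) {
        ColumnStore color = new ColumnStore();
        ColumnStore size = new ColumnStore();
        // feed three sample rows: (red, large), (blue, large), (red, small)
        color.feed(0, "red");   size.feed(0, "large");
        color.feed(1, "blue");  size.feed(1, "large");
        color.feed(2, "red");   size.feed(2, "small");

        // count rows where color = "red" AND size = "large"
        BitSet matches = (BitSet) color.sliceFor("red").clone();   // copy first, and() works in place
        matches.and(size.sliceFor("large"));
        System.out.println("matching rows: " + matches.cardinality());   // prints 1
    }
}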
The SeedMiner framework is mainly targeted at scalability: algorithm developers should be able to develop their own algorithms on the framework and scale up based on their requirements. To achieve this, the data storage layer and the algorithm layer are handled as separate modules. SeedMiner then exposes a common API to the application layer, which application developers can use to build their own solutions without worrying about the underlying data structure. Because the application development interface follows a standard specification, developers can easily plug in algorithms built against that standard.
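As a purely illustrative sketch of what such a pluggable contract could look like (the real SeedMiner API may well differ), an algorithm can be expressed as an interface over the storage module:

// Hypothetical plug-in contract; R is whatever result type the algorithm produces
// (rules, clusters, a model, ...). ColumnStore is the illustrative storage type sketched earlier.
public interface MiningAlgorithm<R> {
    R run(ColumnStore data);
}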