About Talend

Talend Open Studio operates as a code generator allowing data transformation scripts and underlying programs to be generated either in Java (OR) Perl. Its GUI is made of a metadata repository and a graphical designer. The metadata repository contains the definitions and configuration for each job. The information in the metadata repository is used by all of the components of Talend Open Studio.

Course Details : http://talend-training.blogspot.in/2013/04/talend-training-course-details.html

Talend Infobright Connector Problem


Problem
Exception in component tInfobrightOutput
java.sql.SQLException: Access denied for user ‘test’@‘192.168.3.2’ (using password: YES)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1056)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:957)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2938)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1601)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1710)
    at com.mysql.jdbc.Connection.execSQL(Connection.java:2430)
    at com.mysql.jdbc.Connection.execSQL(Connection.java:2359)
    at com.mysql.jdbc.Statement.execute(Statement.java:719)
    at com.infobright.io.InfobrightNamedPipeLoader$ExecutionThread.run(InfobrightNamedPipeLoader.java:207)


while running the script, table is created in Target database and unable to insert data then throws the above exception.
for this i am placing target table as mysql [in same infobright databse] insted of tInfobrightOutput[infobright loader] the script is executed successfully.


Solution
Here is the explanation. The connector needs the “file” privilege because it is doing a LOAD command from a named pipe (FIFO). The standard MySQL output component uses INSERT statements instead, so it doesn’t need this privilege.


“the FILE privileges are global and cannot be applied to a single database”
So correct SQL Statement is:
GRANT FILE ON *.* to test@localhost;


use This url for downloading talend infobright Connector
http://www.infobright.org/Downloads/Contributed-Software/

Make one Value as Context Variable in Talend

My Incremental Load Logic using context variable(with maxid) in Talend


I faced one problem for making my value as variable. below is the scenario.


I am doing my incremental load job and this will be based on  one auto incremented column.  I need to get the data from my source based on last inserted id in my target database ( After full load execution ). For this I need to get the max value for that id from target database and then i need to use that value in my select statement  with greater than condition in Source Database.


To Resolve this I have created Context variable . 
Then get the  max id from source database  and stored in variable.
Used this variable in my Select Statement for getting only Delta(new) data


Steps:
  • Created contextvariable 
  • Created MysqlInput job to query for max row as maxidfield
  • Created tJavaRow job  to assign the variable  context.setProperty("contextvariable",row1.maxidfield)
  • Then Run, MySqlInput job with an amended query:  "SELECT ... FROM .... where id > "+(String)((String)context.getProperty("contextvariable"))

How to Handle Null Pointer Exception In Talend ?

Handle Null Pointer Exception In Talend

If you got a null pointer Exception in any Component you can solve it with simple steps.

  • Identify the columnname in your job.
  • Do the null checking with simple if condition.
    •   row1.columnName==null?defaultvalue:row1.columnName
       specify the required result in defaultvalue to change the data from null to some value

Note: Do this null verification before doing any operations on column. This will prevent your job from NullPointerException 

How to Optimize tuniqueRow Component in Talend

Java heap space error at tUniqRow in Talend


  • To Optimise this component ,Try  to get the unique records by using min number columns in uniq condition.
  • var-char(max) ,text data type columns in unique condition will decrease the performance of the component If you are using tuniqrow component with huge data we may get java heap space Exception. 
    (By storing the data on disk while getting unique record from huge data)
    Below are the steps to resolve the issue.  
    • Click on tunique
    • Go to component window
    • Select you unique condition columns
    • Go to Advanced Settings
    • Click on Use of Disk Option
    • Then select Directory For temp files. (This Directory Structure must be already available in the System. If it is not available the it wouldn't create any directory. This will give you Error  )
    •  Then Execute the jobBut this will be slow with compared to the speed of the job with out Use of Disk Option for small amount of data. 

How to Optimize Tmap Component in Talend

Optimize Tmap component in Talend


If Lookup table has few rows:

  • Open tmap select lookup model as Load Once .This will load lookup data one time before mail flow starts.
  • This Lookup data will be stored in memory then main flow execution is very fast with comparing with lookup data in memory.

IF lookup table data is very large:
Talend cannont store the lookup table data in memory. you will get java heap space Execption. To resolve this issue follow the below steps

  • open tmap
  • go to lookup table
  • click on tmap settings
  • select the value for store temp data property to True 
  • click on ok
  • In Tmap properties basic settings  set Temp Data Directory path by browsing folder
  • Go to Advance settings
  • set  max buffer size(nb of row) to some value based on  lookup condition data type

In my jobs lookup stressful for  2 mill records with 3 GB of memory  of integer data type.

Exception in component tFileOutputDelimited

Error: 
Exception in component tFileOutputDelimited_1
java.io.FileNotFoundException: C:/testfile.csv (The process cannot access the file because it is being used by another process)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(Unknown Source)
at java.io.FileOutputStream.<init>(Unknown Source)

Solution:1  If file is already Available in that location

  • This file is already opened. You need to close the file and execute this job again.

Solution:2  If file not exist

  •  Go to components --> go to advanced settings -->  check create directory if not exists property

Then Execute the job. file will be created along with directory structure.



Populate Time Dimension in Data Warehouse

Stored Procedure for Time Dimension in Data Warehouse
This procedure will populate the Warehouse time Dimension. You need to give only start date and end date then it will generate the year,quarter, month, week (based on month),day name, day , week number for each day.

Note: This is for MySQL database only


  DELIMITER '/';
  DROP PROCEDURE IF EXISTS generatedate /
  CREATE PROCEDURE generatedate()
  BEGIN
  declare i DATE;
  set i = '2004-12-31'; //Starting Date
  DROP TABLE IF EXISTS DIM_DATE_TIME;
  create table DIM_DATE_TIME(dates DATE ,year integer(4),quarter varchar(2),month        varchar(10),monthnumber int(2),week varchar(5),day int(2),dayname varchar(10),quarterno    int(2),weekno int(2));
  set autocommit=0;
  while i < '2011-01-01' // Ending Date
  DO
  SET i = DATE_ADD(i, INTERVAL 1 DAY);
  insert into DIM_DATE_TIME   values(DATE(i),YEAR(i),concat('Q',QUARTER(i)),MONTHNAME(i),MONTH(i),concat('Week',WEE K(i, 5) - WEEK(DATE_SUB(i, INTERVAL DAYOFMONTH(i) - 1 DAY), 5) + 1),DAY(i),DAYNAME(i),QUARTER(i),WEEK(i, 5) - WEEK(DATE_SUB(i, INTERVAL DAYOFMONTH(i) - 1 DAY), 5) + 1);
  END WHILE;
  COMMIT;

  END;/

  DELIMITER ;


call the proceure:
call generatedate();

mysql> select * from DIM_DATE_TIME limit 1\G
*************************** 1. row ***************************
dates: 2005-01-01
year: 2005
quarter: Q1
month: January
monthnumber: 1
week: Week1
day: 1
dayname: Saturday
quarterno: 1
weekno: 1

1 row in set (0.00 sec)

Talend Training Course Details


Training Overview
  • Overview of TOS
  • Talend installation
  • Creating New Project
  • Overview of Repository, Designer, pallete
  • Creating New Job
                - Using different components in talend
  • Explain different talEnd components with real Time scenarios 
  • Creating meta Data for the job
  • Creating job with different input and different output
  • Executing jobs
           - on windows environment
            -On linux Environment        

  • Making multiple jobs to single job
  • Creating new routines
  • Creating jobs with context and variables
  • Export  and import jobs
  • Export and import projects
  • Overview of business model
  • Talend best Practices

ETL Talend Training in Hyderabad


Contact me on the below number for training on Talend ETL tool... ;)
- Raja
Talend Trainer
+91-924-736-0004