Monday, 26 December 2016

Talend In One Day


  Basic Components
Explanation
tFileInputDelimited
Reads any delimited .txt or .csv file.
tFileOutputDelimited
Similar to tFileInputDelimited, but generates a delimited output file (.csv).
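What the two delimited components do can be sketched in plain Java. This is only an illustration, not the code Talend generates: the class name DelimitedSketch, the sample rows and the separators are all made up, and quoted fields are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not Talend-generated code.
class DelimitedSketch {

    // Input side: split each line on the separator.
    // The -1 limit keeps trailing empty fields such as "2;Bob;".
    static List<String[]> parse(List<String> lines, String separator) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(line.split(Pattern.quote(separator), -1));
        }
        return rows;
    }

    // Output side: join the fields of each row back into one delimited line.
    static List<String> format(List<String[]> rows, String separator) {
        List<String> lines = new ArrayList<>();
        for (String[] row : rows) {
            lines.add(String.join(separator, row));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample data: read with ';' and write with ','.
        List<String> input = Arrays.asList("1;Alice;NY", "2;Bob;");
        for (String line : format(parse(input, ";"), ",")) {
            System.out.println(line);
        }
    }
}
```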
tFileInputXML
Reads any XML file as input and extracts fields using XPath.
Store the XML in any location.
Specify the XPath loop expression: <employee>
Fields to extract: <empname>, <DOB>
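The loop-and-extract idea can be tried outside Talend with the standard javax.xml.xpath API. This is a rough sketch: the root element <employees>, the sample values and the class name XPathLoopSketch are assumptions, only <employee>, <empname> and <DOB> come from the notes above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not Talend-generated code.
class XPathLoopSketch {

    // Loops over every <employee> and extracts {empname, DOB} for each one.
    static List<String[]> extract(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Loop expression; the <employees> root is an assumed wrapper element.
            NodeList employees = (NodeList) xpath.evaluate(
                    "/employees/employee",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String[]> rows = new ArrayList<>();
            for (int i = 0; i < employees.getLength(); i++) {
                Node employee = employees.item(i);
                // Field expressions are evaluated relative to the loop node.
                String name = xpath.evaluate("empname", employee);
                String dob = xpath.evaluate("DOB", employee);
                rows.add(new String[] {name, dob});
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document.
        String xml = "<employees>"
                + "<employee><empname>Alice</empname><DOB>1990-01-15</DOB></employee>"
                + "<employee><empname>Bob</empname><DOB>1988-07-02</DOB></employee>"
                + "</employees>";
        for (String[] row : extract(xml)) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```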
tRowGenerator
Generates rows sequentially.
Go to Talend Data Fabric and select the functions accordingly.
OrderID and Date sequences are generated automatically.
Similar to an Oracle sequence.
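To see why this resembles an Oracle sequence, here is a small named-sequence sketch in plain Java. The class SequenceSketch and its next method are hypothetical and are not the function tRowGenerator actually calls; the start value 1000 is made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not a Talend routine.
class SequenceSketch {

    // Next value to hand out for each named sequence.
    private static final Map<String, Integer> NEXT = new HashMap<>();

    // Returns the current value of the named sequence and advances it by step.
    // The first call for a name returns start, much like NEXTVAL on a new sequence.
    static synchronized int next(String name, int start, int step) {
        int value = NEXT.getOrDefault(name, start);
        NEXT.put(name, value + step);
        return value;
    }

    public static void main(String[] args) {
        // Generate three rows with an auto-incremented OrderID.
        for (int i = 0; i < 3; i++) {
            System.out.println("OrderID=" + next("order", 1000, 1));
        }
    }
}
```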
tFixedFlowInput
Manually creates a fixed dataset.
Basic settings -> create 2 columns, State_Cd and State_Nm, and enter static values.
Change Number of rows and it generates static rows accordingly.
Inline Table allows you to add multiple rows and multiplies with the rows * no. of columns.
tMap
Performs a join between 2 input files. Expr. key performs the actual join.
row1 and row2 need to be joined on a common column in the Expr. key.
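Conceptually, the join on Expr. key works like a lookup by key. The sketch below assumes an inner join (unmatched main rows are dropped) and uses made-up State_Cd data; JoinSketch is a hypothetical class, not what tMap generates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not Talend-generated code.
class JoinSketch {

    // mainRows (row1): {key, value}; lookupRows (row2): {key, name}.
    // Returns "key,value,name" for every main row whose key exists in the lookup.
    static List<String> join(List<String[]> mainRows, List<String[]> lookupRows) {
        // Index the lookup flow by its join key.
        Map<String, String> lookup = new HashMap<>();
        for (String[] row : lookupRows) {
            lookup.put(row[0], row[1]);
        }
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String match = lookup.get(row[0]);
            if (match != null) { // inner join: skip rows without a match
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data: orders keyed by state code, joined to state names.
        List<String[]> row1 = Arrays.asList(
                new String[] {"NY", "order-1"},
                new String[] {"TX", "order-2"},
                new String[] {"ZZ", "order-3"});
        List<String[]> row2 = Arrays.asList(
                new String[] {"NY", "New York"},
                new String[] {"TX", "Texas"});
        for (String line : join(row1, row2)) {
            System.out.println(line);
        }
    }
}
```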
tSample
Generates sequential numbers as a field in the output. The range can be changed in Basic settings.
tJava
Runs standalone core Java code entered in Basic settings.
tSalesforceBulkExec
Provide the Salesforce web service SOAP URL with username and password.
Get the mySF0.1 from the Salesforce metadata.
tLibraryLoad
Loads any external library (JAR file), which can then be called from tJava. The tJava
utility can be invoked with OnComponentOk.


Big Data Components
Explanation
tHDFSConnection
Establishes a connection to an HDFS system.
Distribution: Apache
NameNode URI: hdfs://10.140.6.125:9000/
Username: <username of the virtual machine>
Connect tHDFSConnection via OnSubjobOk to tRowGenerator and tLogRow, ending with tHDFSOutput_1.
tHDFSOutput
Generates the output and stores it in HDFS.
Filename: /user/hadoop/talendoutput/out.txt
Action: Create
Run the job.
Then check the output in HDFS:
hadoop fs -ls /user/hadoop/talendoutput
hadoop fs -cat /user/hadoop/talendoutput/out.txt
tHDFSInput
Retrieves data from an HDFS file.
Create the tHDFSConnection.
Create the tHDFSInput.
Enter Filename /user/hadoop/talendoutput/out.txt and the Field separator.
Connect it to tLogRow.
 
