Oracle's Talk: Load CSV File data to oracle in ODI 12c

CSV File loading in ODI 12c

This paper describes on loading CSV file to oracle. To accomplish this task I have used below O/S and Oracle Data integrator.

Operating system used : Windows 10
Data Integrator Version 12.2.1.3.0
Oracle 12c Version 12.2.1

Prerequisites :

Oracle data integrator studio should be installed.
Master and work repository should be created
Oracle database 12c version 12.2.1 (You can perform using Oracle 11g also
Oracle database 11g,12c.

ODI Technologies used to perform loading data from CSV(Comma Separated Values / Comma Delimited) to Oracle Database table.

File Technology
Oracle Technology

In this paper we will load data from CSV to Oracle. Before loading data we have to understand the structure of CSV and for what purpose this CSV data we are loading.

We have taken CSV for IPL(Indian Premier League) It’s a T20 Circket Tournament played every year in India and will be used for further analysis. For further information on IPL you can visit https://www.iplt20.com/

In this paper we have taken Matches.csv.

Let us understand the CSV Files its structure and what data are there in respective columns.

Matches.Csv

Column Name	Data Type	Description
ID	Number	Used for Unique Sequence
SEASON	Number	Used for Season or Year in which year this matches are played
CITY	Character	In which city the match was played
DATE	Date	On which day match was played
TEAM1	Character	Team1 Name
TEAM2	Character	Team2 Name
TOSS_WINNER	Character	Team name who won the toss
TOSS_DECISION	Character	Decision of toss winner eigher batting or fielding
RESULT	Character	Result of match either normal/Tie/no result between the team
DL_APPLIED	Number	Is Duckworth Lewis method applied or not i.e. either 0 for not applied or 1 for applied
WINNER	Character	Winning team
WIN_BY_RUNS	Number	Win by runs
WIN_BY_WICKETS	Number	Win by wickets
PLAYER_OF_MATCH	Character	Name of player of the match
VENUE	Character	Name of stedium where the match was played
UMPIRE1	Character	Name of Umpire 1
UMPIRE2	Character	Name of Umpire 2
UMPIRE3	Character	Name of Umpire 3

Now we have enough understanding of CSV file we are going to load to oracle.

create tables for Matches.CSV file using create table command.

To create table login using Sql Developer/sqlplus or any other tool you have installed on your machine.

I have created user cricket for this example and will now create table name matches where csv data will be loaded.

create table MATCHES

(

MATCH_ID NUMBER(5) not null,

SEASON NUMBER(4) not null,

CITY VARCHAR2(20),

MATCH_DATE DATE not null,

TEAM1 VARCHAR2(30) not null,

TEAM2 VARCHAR2(30) not null,

TOSS_WINNER VARCHAR2(30) not null,

TOSS_DECISION VARCHAR2(5) not null,

MATCH_RESULT VARCHAR2(10) not null,

DL_APPLIED NUMBER(1) not null,

WINNER VARCHAR2(30),

WIN_BY_RUNS NUMBER(3) not null,

WIN_BY_WICKETS NUMBER(2) not null,

PLAYER_OF_MATCH VARCHAR2(30),

VENUE VARCHAR2(60) not null,

UMPIRE1 VARCHAR2(30) not null,

UMPIRE2 VARCHAR2(30) not null,

UMPIRE3 VARCHAR2(30)

);

Step 1. Now start the ODI developer studio and connect to work repository and select topology tab -> expand Technologies and expand file.

Right click FILE_GENERIC and select Open.

The File Generic tab will open on right side.

Keep Name as it is
Host : localhost
User : [OS User is optional]
Password : [Password is optional]

Keep other options as it is.

Now click on JDBC option do not change anything for this option. Now click on Test Connection

Click Test button.

If test is successful you will see the message window Successful Connection. Click Ok.

Step 2. Now on topology tab right click FILE_GENERIC and select New Physical Schema.

The Physical schema panel opens on right side.
On physical schema enter path in directory (schema) and Work schema where your CSV files are stored. In this case my directory path is [D:\ODI_Learning\IPL Match ODI Loading] yours may be different path. You cannot browse it but you have to enter the path. Save your work.

Click OK button on context message window.

On Topology tab expand FILE_GENERIC you can see the FILE_GENERIC. D:\ODI_Learning\IPL Match ODI Loading physical schema is created.

Step 3. On Topology tab expand Logical Architecture then right click file and select New Logical Schema.

On right side you will see Logical Schema tab is opened. Enter IPL_Match_ODI_Loading in Name : and select Physical schema we have created earlier. Now Save Logical schema and close the tab.

Step 4. Up to this we have created physical schema and Logical schema. Now we will create Model.

Now click on Designer Tab -> Expand Models -> Click on New Model Folder-> Select New Model Folder.

A New Model folder tab is opened. On Definition tab Name : Enter IPL_Match_ODI_Model and Save this model folder.

A new model folder is created.

Now right click on IPL_Match_ODI_Model folder and select New Model.

This will open Model panel on right side. Enter below information on Definition tab.

Name : IPL_MATCH_ODI
Code : will automatically filled as Name value
Technology : Select File
Logical Schema : Select IPL_Match_Odi_Loading which was created during Logical Schema creation.

Save your work.

IPL_MATCH_ODI Model is created.

Right click on IPL_MATCH_ODI and select New Datastore.

Now on Datastore enter following information

Name : IPL_Match_ODI
Datastore Type : Select Table
Resource Name : Browse the CSV file and select Matches.CSV File.

You can observe that it automatically navigates to the Folder D:\ODI_Learning\IPL Match ODI Loading.

Click on Files tab.

On Files tab pull down File Format and select Delimited.

Heading (Number of Lines) : select 1 because in our file First line is heading of data present in Matches.CSV file.

on Field separator select Other and enter [,] Comma in field and it will automatically adds Hexadecimal[\u002c] character for Comma.

Click on Attributes tab. And Press Reverse Engineer button. If changes are not saved save it.

To open and fetch data from CSV files select local agent and press Ok button.

You will see the Header is displays and datatype,physical length etc.

Now Edit Type, Physical length, length as per below image.

Now expand IPL_MATCH_ODI and expand IPL_MATCH_ODI(matches.csv) and expand Attributes.

Now right click IPL_MATCH_ODI(matches.csv) and select View Data.

You can see the data available in Matches.csv file.

Summary :

We have created Physical Schema, Logical Schema and Model for reading data from Matchs.csv file.

In next step we will now create Physical schema to load data in Oracle Table.

Step 5. In topology tab select Physical Architecture panel and right click on Oracle and select New Data Server option.

On right side Data Server panel will open. Enter below information on Definition tab.

Name : Cricket_DB
Instance / dblink (Data Server) : in my case pdborcl yours may be different
user : Database Schema user name in my case [cricket]
Password : Database user password

Now click on JDBC tab and enter below details.

On JDBC Driver select oracle.jdbc.OracleDriver if not automatically populated.
JDBC URL: jdbc:oracle:thin:@localhost:1521/pdborcl in my case yours may be different initially it will populate as [jdbc:oracle:thin:@<host>:<port/ServiceName>]

replace <host> with your host name.
replace <port> with your port number on which database is installed.
replace <ServiceName> with your service name

Now click on Test Connection button it will open pop window now press Test button.

If Test connection is successful it will show you Successful message. Click ok now your database connection is successful.

Now on left side you will see the Cricket_DB data server is created. Right click on Cricket_DB and select New Physical Schema.

On right panel Physical Schema is opened. Enter below details.

Name : Do not change
Schema (Schema) :Select CRICKET from drop down
Schema(Work Schema) : Select CRICKET from drop down

Now Save your work.

Now create Logical schema.

On Topology -> Logical Architecture -> right click Oracle -> select New Logical Schema.

On right Logical Schema panel is opened. On Definition tab select physical Schema drop down and select Cricket_DB.CRICKET and Save your work.

Step 6. Now we will create Target Model.

Click on Designer tab -> expand Model folder and click on model folder icon and select New Model.

On right new model panel will be opened. Enter below details on Definition tab.

Name : IL_Matches_DB
Code : will be automatically filled
Technology : Select Oracle from drop down
Logical Schema : Select LTS_Cricket from drop down

Now click on Reverse Engineering tab and then press Reverse Engineer button.

Now on left side you can see the IPL_Matches_DB Model is created. Now expand it and check the Attributes.

Now let us check the proper connectivity is happened and view the data available in Matches table.

On right side Data view panel opened. As of now we do not have any data available in Matches table.

Step 7. Now Its time to create Project and import Knowledge module which will used to create mappings to load data from Csv file to Table. In previous version mappings was known as Interface.

On left Designer tab Expand Projects tab and click project icon and select New Project.

On right side a new project panel is open. Enter information as below.
Name: Load_Matches_CSV_to_Table

On left side you can check project is created.

Step 8. Now we will Import Knowledge module. Right click Knowledge module and select Import Knowledge Modules.

For loading our file we will use following loading KMs

1) IKM SQL Incremental Update

2) LKM File to SQL

On Knowledge module select IKM SQL Incremental Update

Now scroll down to find LKM File to SQL and select. Click Ok.

Click Close.

Verify that both KMs are imported in project.

Step 9. Now we will create Mappings.

Expand Designer -> Projects -> Load_Matches_CSV_to_Table -> First Folder and right click Mappings and select New Mapping.

A new mapping popup will be shown. Enter following details and click ok.

On right side a New blank window will open to drag source and target mappings.

Now Expand Designer -> Models -> IPL_Match_ODI_Model -> IPL_MATCH_ODI drag and drop IPL_Match_ODI(Matches.csv) to right side opened window.

Now Expand IPL_Matches_DB and drag and drop MATCHES to right side.

Now we have both the mappings one for source and another for target. Now we have to link both the source and target mappings. And map the columns.

You can see the connectors on both side of the boxes.

Now we will connect both the boxes with connectors. Click IPL side connector and drage it to MATCHES connector. And click OK.

You can observe that columns which are matched with the name are mapped automatically by dark gray symbol

those are not matched need to match manually by drag from left to right.

Now we will match the column which are not matched automatically by drag and drop from left to right. For example IPL side date Is not matched. We will match date column with MATCH_DATE.

We will now match other column same way. The list of columns need to be matched are as below.

1) Id -> Match_ID

2) Date -> Match_Date

3) Result -> Match_Result

Now all columns are matched. Now save your work.

Click on MATCHES and in properties select Target and select Incremental Update as Integration type.

Now our mapping Is completed. We can run mapping and load data to database.

Now click on green arrow as shown below.

Click Ok on Run dialog box.

Click OK.

Open Operator tab and expand Date -> Today and expand the latest job. You can see green circle which executed successfully.

Now verify our table MATCHES in Sql Developer by selecting our table as below. As you can see that our data is loaded successfully. In CSV total records are 577 and loaded are also 577.

This concludes loading data from CSV file to Oracle table.

You can reach me at taral.vyas@yahoo.com or taralvyas74@gmail.com

3 comments:

Ezra AbramJanuary 12, 2021 at 1:34 AM
Thanks for the nice blog here.I was searching this one for a long time.This blog is very helpful for my studies..I got another one site also,which is same as yours Oracle ODI .Check this one also Oracle Fusion Manufacturing .Sure it will be helpful for you too..Once more iam thanking you for your creative blog.
AnonymousJanuary 8, 2024 at 8:37 AM
Can you please help, how can we get the count of the csv file through ODI in runtime. So that we can send an alert mail like count in csv and count in table after loading.

I was trying to get the count of the csv file through ODI variable but I'm getting an error like "ODI-40400: Invalid format description"

cat 'XXX.csv' |wc -l

In the variable, I have chosen the schema as file_generic schema where the csv file is present.

Thanks,
Amirtha
AnonymousOctober 3, 2024 at 2:22 PM
Cool and I have a nifty provide: Where Do You Get The Money To Renovate A House home renovation services

Oracle's Talk

Friday, November 29, 2019

Load CSV File data to oracle in ODI 12c

You can reach me at taral.vyas@yahoo.com or taralvyas74@gmail.com

3 comments:

Blog Archive