CSV File loading in ODI 12c
This paper describes on loading CSV file to oracle. To
accomplish this task I have used below O/S and Oracle Data integrator.
- Operating system used : Windows 10
- Data Integrator Version 12.2.1.3.0
- Oracle 12c Version 12.2.1
Prerequisites :
- Oracle data integrator studio should be installed.
- Master and work repository should be created
- Oracle database 12c version 12.2.1 (You can perform using Oracle 11g also
- Oracle database 11g,12c.
ODI Technologies used to perform loading data from CSV(Comma Separated Values / Comma Delimited) to Oracle Database table.
- File Technology
- Oracle Technology
In this paper we will load data from CSV to Oracle. Before
loading data we have to understand the structure of CSV and for what purpose
this CSV data we are loading.
We have taken CSV for IPL(Indian Premier
League) It’s a T20 Circket Tournament
played every year in India and will be used for further analysis. For further
information on IPL you can visit https://www.iplt20.com/
In this paper we have taken Matches.csv.
Let us understand the CSV Files its structure and what
data are there in respective columns.
Matches.Csv
Column Name
|
Data Type
|
Description
|
ID
|
Number
|
Used for Unique Sequence
|
SEASON
|
Number
|
Used for Season or Year in which year
this matches are played
|
CITY
|
Character
|
In which city the match was played
|
DATE
|
Date
|
On which day match was played
|
TEAM1
|
Character
|
Team1 Name
|
TEAM2
|
Character
|
Team2 Name
|
TOSS_WINNER
|
Character
|
Team name who won the toss
|
TOSS_DECISION
|
Character
|
Decision of toss winner eigher
batting or fielding
|
RESULT
|
Character
|
Result of match either normal/Tie/no
result between the team
|
DL_APPLIED
|
Number
|
Is Duckworth Lewis method applied or
not i.e. either 0 for not applied or 1 for applied
|
WINNER
|
Character
|
Winning team
|
WIN_BY_RUNS
|
Number
|
Win by runs
|
WIN_BY_WICKETS
|
Number
|
Win by wickets
|
PLAYER_OF_MATCH
|
Character
|
Name of player of the match
|
VENUE
|
Character
|
Name of stedium where the match was
played
|
UMPIRE1
|
Character
|
Name of Umpire 1
|
UMPIRE2
|
Character
|
Name of Umpire 2
|
UMPIRE3
|
Character
|
Name of Umpire 3
|
Now we have enough understanding of CSV file we are going to
load to oracle.
create tables for Matches.CSV file using
create table command.
To create table login using Sql Developer/sqlplus or any other
tool you have installed on your machine.
I have created user cricket for this example and will now
create table name matches where csv data will be loaded.
create table MATCHES
(
MATCH_ID NUMBER(5) not null,
SEASON NUMBER(4) not null,
CITY VARCHAR2(20),
MATCH_DATE DATE not null,
TEAM1 VARCHAR2(30) not null,
TEAM2 VARCHAR2(30) not null,
TOSS_WINNER VARCHAR2(30) not null,
TOSS_DECISION VARCHAR2(5) not null,
MATCH_RESULT VARCHAR2(10) not null,
DL_APPLIED NUMBER(1) not null,
WINNER VARCHAR2(30),
WIN_BY_RUNS NUMBER(3) not null,
WIN_BY_WICKETS NUMBER(2) not null,
PLAYER_OF_MATCH VARCHAR2(30),
VENUE VARCHAR2(60) not null,
UMPIRE1 VARCHAR2(30) not null,
UMPIRE2 VARCHAR2(30) not null,
UMPIRE3 VARCHAR2(30)
);
Step 1. Now start
the ODI developer studio and connect to work repository and select topology tab
-> expand Technologies and expand file.
Right
click FILE_GENERIC and select Open.
The File Generic tab will open on right side.
Keep Name as it is
Host : localhost
User : [OS User is optional]
Password : [Password is optional]
Host : localhost
User : [OS User is optional]
Password : [Password is optional]
Keep
other options as it is.
Now click on JDBC option do not change anything for this
option. Now click on Test Connection
Click Test button.
If test is successful you will see the message window
Successful Connection. Click Ok.
Step 2. Now on topology tab right click FILE_GENERIC and select New Physical Schema.
The Physical schema panel opens on right side.
On physical schema enter path in directory (schema) and Work schema where your CSV files are stored. In this case my directory path is [D:\ODI_Learning\IPL Match ODI Loading] yours may be different path. You cannot browse it but you have to enter the path. Save your work.
On physical schema enter path in directory (schema) and Work schema where your CSV files are stored. In this case my directory path is [D:\ODI_Learning\IPL Match ODI Loading] yours may be different path. You cannot browse it but you have to enter the path. Save your work.
Click OK button on context message window.
On Topology tab expand FILE_GENERIC you can see the
FILE_GENERIC. D:\ODI_Learning\IPL Match ODI Loading physical schema is created.
Step 3. On Topology tab expand Logical Architecture then right click file
and select New Logical Schema.
On right side you will see Logical Schema tab is opened.
Enter IPL_Match_ODI_Loading in Name : and select Physical schema we have
created earlier. Now Save Logical schema and close the tab.
Step 4. Up to this we have created physical schema and
Logical schema. Now we will create Model.
Now click on Designer Tab -> Expand Models -> Click on New Model Folder-> Select New Model Folder.
A New Model folder tab is opened. On Definition tab Name :
Enter IPL_Match_ODI_Model and Save this model folder.
A new model folder is created.
Now right click on IPL_Match_ODI_Model folder and select New
Model.
This will open Model panel on right side. Enter below
information on Definition tab.
Name : IPL_MATCH_ODI
Code : will automatically filled as Name value
Technology : Select File
Logical Schema : Select IPL_Match_Odi_Loading which was created during Logical Schema creation.
Code : will automatically filled as Name value
Technology : Select File
Logical Schema : Select IPL_Match_Odi_Loading which was created during Logical Schema creation.
Save your work.
Right click on IPL_MATCH_ODI and select New Datastore.
Now on Datastore enter following information
Name : IPL_Match_ODI
Datastore Type : Select Table
Resource Name : Browse the CSV file and select Matches.CSV File.
Datastore Type : Select Table
Resource Name : Browse the CSV file and select Matches.CSV File.
You can
observe that it automatically navigates to the Folder D:\ODI_Learning\IPL Match
ODI Loading.
Click on Files tab.
On Files tab pull down File Format and select Delimited.
Heading (Number of Lines) : select 1 because in our file First line is heading
of data present in Matches.CSV file.
on Field separator select Other and enter [,] Comma in field and it will automatically adds Hexadecimal[\u002c] character for Comma.
Click on Attributes tab. And Press Reverse Engineer button.
If changes are not saved save it.
You will see the Header is displays and datatype,physical
length etc.
Now Edit Type, Physical length, length as per below image.
Now expand
IPL_MATCH_ODI and expand IPL_MATCH_ODI(matches.csv) and
expand Attributes.
Now right click IPL_MATCH_ODI(matches.csv) and select View Data.
You can see the data available in Matches.csv file.
Summary :
We have created Physical Schema, Logical Schema and Model
for reading data from Matchs.csv file.
In next step we will now create
Physical schema to load data in Oracle Table.
Step 5. In topology tab select Physical Architecture panel
and right click on Oracle and select New Data Server option.
On right side Data Server panel will open. Enter below
information on Definition tab.
Name : Cricket_DB
Instance / dblink (Data Server) : in my case pdborcl yours may be different
user : Database Schema user name in my case [cricket]
Password : Database user password
Instance / dblink (Data Server) : in my case pdborcl yours may be different
user : Database Schema user name in my case [cricket]
Password : Database user password
Now click on JDBC tab and enter below details.
On JDBC Driver select oracle.jdbc.OracleDriver
if not automatically populated.
JDBC URL: jdbc:oracle:thin:@localhost:1521/pdborcl in my case yours may be different initially it will populate as [jdbc:oracle:thin:@<host>:<port/ServiceName>]
JDBC URL: jdbc:oracle:thin:@localhost:1521/pdborcl in my case yours may be different initially it will populate as [jdbc:oracle:thin:@<host>:<port/ServiceName>]
replace <host> with your host name.
replace <port> with your port number on which database is installed.
replace <ServiceName> with your service name
replace <port> with your port number on which database is installed.
replace <ServiceName> with your service name
Now click on Test Connection button it will open pop window
now press Test button.
If Test connection is successful it will show you Successful
message. Click ok now your database connection is successful.
Now on left side you will see the Cricket_DB data server is
created. Right click on Cricket_DB and select New Physical Schema.
On right panel Physical Schema is opened. Enter below details.
Name : Do not change
Schema (Schema) :Select CRICKET from drop down
Schema(Work Schema) : Select CRICKET from drop down
Schema (Schema) :Select CRICKET from drop down
Schema(Work Schema) : Select CRICKET from drop down
Now Save your work.
Now create Logical schema.
On Topology -> Logical Architecture -> right click
Oracle -> select New Logical Schema.
On right Logical Schema panel is opened. On Definition tab
select physical Schema drop down and select Cricket_DB.CRICKET and Save your
work.
Step 6. Now we will create Target Model.
Click on Designer tab -> expand Model folder and click on
model folder icon and select New Model.
On right new model panel will be opened. Enter below details
on Definition tab.
Name : IL_Matches_DB
Code : will be automatically filled
Technology : Select Oracle from drop down
Logical Schema : Select LTS_Cricket from drop down
Code : will be automatically filled
Technology : Select Oracle from drop down
Logical Schema : Select LTS_Cricket from drop down
Now click on Reverse Engineering tab and then press Reverse
Engineer button.
Now on left side you can see the IPL_Matches_DB Model is
created. Now expand it and check the Attributes.
Now let us check the proper connectivity is happened and
view the data available in Matches table.
On right side Data view panel opened. As of now we do not
have any data available in Matches table.
Step 7. Now Its time to create Project and import Knowledge
module which will used to create mappings to load data from Csv file to Table.
In previous version mappings was known as Interface.
On left Designer tab Expand Projects tab and click project
icon and select New Project.
.
On right side a new project panel is open. Enter information as below.
Name: Load_Matches_CSV_to_Table
Name: Load_Matches_CSV_to_Table
On left side you can check project is created.
Step 8. Now we will Import Knowledge module. Right click
Knowledge module and select Import Knowledge Modules.
For loading our file we will use following loading KMs
1)
IKM SQL
Incremental Update
2)
LKM File to SQL
On Knowledge module select IKM
SQL Incremental Update
Now scroll down to find LKM
File to SQL and select. Click Ok.
Click Close.
Verify that both KMs are imported in project.
Step 9. Now we will create Mappings.
Expand Designer -> Projects -> Load_Matches_CSV_to_Table
-> First Folder and right click Mappings and select New Mapping.
A new mapping popup will be shown. Enter following details
and click ok.
On right side a New blank window will open to drag source
and target mappings.
Now Expand Designer -> Models -> IPL_Match_ODI_Model
-> IPL_MATCH_ODI drag and drop IPL_Match_ODI(Matches.csv)
to right side opened window.
Now Expand IPL_Matches_DB
and drag and drop MATCHES to right side.
Now we have both the mappings one for source and another for target. Now we have to link both the source and target mappings. And map the columns.
You can see the connectors on both side of the boxes.
Now we will connect both the boxes with connectors. Click
IPL side connector and drage it to MATCHES connector. And click OK.
You can observe that columns which are matched with the name are mapped automatically by dark gray symbol
those are not matched need to match manually by drag from left to right.
Now we will match the column which are not matched
automatically by drag and drop from left to right. For example IPL side date Is
not matched. We will match date column with MATCH_DATE.
We will now match other column same way. The list of columns
need to be matched are as below.
1)
Id -> Match_ID
2)
Date -> Match_Date
3)
Result -> Match_Result
Now all columns are matched. Now save your work.
Click on MATCHES and in properties select Target and select
Incremental Update as Integration type.
Now our mapping Is completed. We can run mapping and load
data to database.
Now click on green arrow as shown below.
Click Ok on Run dialog box.
Click OK.
Open Operator tab and expand Date -> Today and expand the
latest job. You can see green circle which executed successfully.
Now
verify our table MATCHES in Sql Developer by selecting our table as below. As
you can see that our data is loaded successfully. In CSV total records are 577
and loaded are also 577.
This concludes loading data from CSV file to Oracle table.