Friday, November 29, 2019

Load CSV File data to oracle in ODI 12c

CSV File loading in ODI 12c

This paper describes on loading CSV file to oracle. To accomplish this task I have used below O/S and Oracle Data integrator.

  • Operating system used : Windows 10 
  • Data Integrator Version 12.2.1.3.0
  • Oracle 12c Version 12.2.1
       Prerequisites :
  • Oracle data integrator studio should be installed.
  • Master and work repository should be created
  • Oracle database 12c version 12.2.1 (You can perform using Oracle 11g also
  • Oracle database 11g,12c.

ODI Technologies used to perform loading data from CSV(Comma Separated Values / Comma Delimited) to Oracle Database table.

  • File Technology
  • Oracle Technology
In this paper we will load data from CSV to Oracle. Before loading data we have to understand the structure of CSV and for what purpose this CSV data we are loading.

We have taken CSV for IPL(Indian Premier League)  It’s a T20 Circket Tournament played every year in India and will be used for further analysis. For further information on IPL you can visit https://www.iplt20.com/

In this paper we have taken Matches.csv.

Let us understand the CSV Files its structure and what data are there in respective columns.

Matches.Csv

Column Name
Data Type
Description
ID
Number
Used for Unique Sequence
SEASON
Number
Used for Season or Year in which year this matches are played
CITY
Character
In which city the match was played
DATE
Date
On which day match was played
TEAM1
Character
Team1 Name
TEAM2
Character
Team2 Name
TOSS_WINNER
Character
Team name who won the toss
TOSS_DECISION
Character
Decision of toss winner eigher batting or fielding
RESULT
Character
Result of match either normal/Tie/no result between the team
DL_APPLIED
Number
Is Duckworth Lewis method applied or not i.e. either 0 for not applied or 1 for applied
WINNER
Character
Winning team
WIN_BY_RUNS
Number
Win by runs
WIN_BY_WICKETS
Number
Win by wickets
PLAYER_OF_MATCH
Character
Name of player of the match
VENUE
Character
Name of stedium where the match was played
UMPIRE1
Character
Name of Umpire 1
UMPIRE2
Character
Name of Umpire 2
UMPIRE3
Character
Name of Umpire 3


Now we have enough understanding of CSV file we are going to load to oracle.

create tables for Matches.CSV file using create table command.

To create table login using Sql Developer/sqlplus or any other tool you have installed on your machine.

I have created user cricket for this example and will now create table name matches where csv data will be loaded.

create table MATCHES
(
  MATCH_ID        NUMBER(5) not null,
  SEASON          NUMBER(4) not null,
  CITY            VARCHAR2(20),
  MATCH_DATE      DATE not null,
  TEAM1           VARCHAR2(30) not null,
  TEAM2           VARCHAR2(30) not null,
  TOSS_WINNER     VARCHAR2(30) not null,
  TOSS_DECISION   VARCHAR2(5) not null,
  MATCH_RESULT    VARCHAR2(10) not null,
  DL_APPLIED      NUMBER(1) not null,
  WINNER          VARCHAR2(30),
  WIN_BY_RUNS     NUMBER(3) not null,
  WIN_BY_WICKETS  NUMBER(2) not null,
  PLAYER_OF_MATCH VARCHAR2(30),
  VENUE           VARCHAR2(60) not null,
  UMPIRE1         VARCHAR2(30) not null,
  UMPIRE2         VARCHAR2(30) not null,
  UMPIRE3         VARCHAR2(30)
);

Step 1. Now start the ODI developer studio and connect to work repository and select topology tab -> expand Technologies and expand file.


Right click FILE_GENERIC and select Open.  


The File Generic tab will open on right side.

Keep  Name as it is
Host : localhost
User : [OS User is optional]
Password : [Password is optional]
Keep other options as it is.


Now click on JDBC option do not change anything for this option. Now click on Test Connection


Click Test button.


If test is successful you will see the message window Successful Connection. Click Ok.










Step 2. Now on topology tab right click FILE_GENERIC  and select New Physical Schema.











The Physical schema panel opens on right side.
On physical schema enter path in directory (schema) and Work schema  where your CSV files are stored. In this case my directory path is [D:\ODI_Learning\IPL Match ODI Loading] yours may be different path. You cannot browse it but you have to enter the path. Save your work.























Click OK button on context message window.





On Topology tab expand FILE_GENERIC you can see the FILE_GENERIC. D:\ODI_Learning\IPL Match ODI Loading physical schema is created.





Step 3. On Topology tab expand  Logical Architecture then right click file and select New Logical Schema.

























On right side you will see Logical Schema tab is opened. Enter IPL_Match_ODI_Loading in Name : and select Physical schema we have created earlier. Now Save Logical schema and close the tab.











Step 4. Up to this we have created physical schema and Logical schema. Now we will create Model.
Now click on Designer Tab -> Expand Models ->  Click on New Model Folder->  Select New Model Folder.

















A New Model folder tab is opened. On Definition tab Name : Enter IPL_Match_ODI_Model and Save this model folder.










A new model folder is created.






Now right click on IPL_Match_ODI_Model folder and select New Model.



















This will open Model panel on right side. Enter below information on Definition tab.

Name                 : IPL_MATCH_ODI
Code                  : will automatically filled as Name value
Technology        : Select File
Logical Schema : Select IPL_Match_Odi_Loading which was created during Logical Schema creation.

Save your work.


















IPL_MATCH_ODI Model is created.











Right click on IPL_MATCH_ODI and select New Datastore.
Now on Datastore enter following information
Name                  : IPL_Match_ODI
Datastore Type   Select Table
Resource Name  : Browse the CSV file and select Matches.CSV File. 

You can observe that it automatically navigates to the Folder D:\ODI_Learning\IPL Match ODI Loading.

































Click on Files tab.

On Files tab pull down File Format and select Delimited.

Heading (Number of Lines) : select 1 because in our file First line is heading of data present in Matches.CSV file.

on Field separator select Other and enter [,] Comma in field and it will automatically adds Hexadecimal[\u002c] character for Comma.

























Click on Attributes tab. And Press Reverse Engineer button. If changes are not saved save it.











You will see the Header is displays and datatype,physical length etc.



























Now Edit Type, Physical length, length as per below image.



























Now expand  IPL_MATCH_ODI and expand IPL_MATCH_ODI(matches.csv)  and  expand Attributes.
































Now right click IPL_MATCH_ODI(matches.csv)  and select View Data.


















You can see the data available in Matches.csv file.

















Summary :
We have created Physical Schema, Logical Schema and Model for reading data from Matchs.csv file.
In next step we will now create Physical schema to load data in Oracle Table.

Step 5. In topology tab select Physical Architecture panel and right click on Oracle and select New Data Server option.


















On right side Data Server panel will open. Enter below information on Definition tab.
Name                                          : Cricket_DB
Instance / dblink (Data Server)   : in my case pdborcl yours may be different
user                                             : Database Schema  user name in my case [cricket]
Password                                    : Database user password



















Now click on JDBC tab and enter below details.
On JDBC Driver select oracle.jdbc.OracleDriver if not automatically populated.
JDBC URL: jdbc:oracle:thin:@localhost:1521/pdborcl in my case yours may be different initially it will populate as [jdbc:oracle:thin:@<host>:<port/ServiceName>]

replace <host> with your host name.
replace <port> with your port number on which database is installed.
replace <ServiceName> with your service name
















Now click on Test Connection button it will open pop window now press Test button.











If Test connection is successful it will show you Successful message. Click ok now your database connection is successful.










Now on left side you will see the Cricket_DB data server is created. Right click on Cricket_DB and select New Physical Schema.


















On right panel Physical Schema is opened.  Enter below details.

Name                              : Do not change
Schema (Schema)           :Select CRICKET from drop down
Schema(Work Schema)  : Select CRICKET from drop down

Now Save your work.



















Now create Logical schema.
On Topology -> Logical Architecture -> right click Oracle -> select New Logical Schema.


















On right Logical Schema panel is opened. On Definition tab select physical Schema drop down and select Cricket_DB.CRICKET and Save your work.














Step 6. Now we will create Target Model.
Click on Designer tab -> expand Model folder and click on model folder icon and select New Model.










On right new model panel will be opened. Enter below details on Definition tab.
Name                 : IL_Matches_DB
Code                  : will be automatically filled
Technology        : Select Oracle from drop down
Logical Schema : Select LTS_Cricket from drop down


Now click on Reverse Engineering tab and then press Reverse Engineer button.












Now on left side you can see the IPL_Matches_DB Model is created. Now expand it and check the Attributes.






















Now let us check the proper connectivity is happened and view the data available in Matches table.



















On right side Data view panel opened. As of now we do not have any data available in Matches table.











Step 7. Now Its time to create Project and import Knowledge module which will used to create mappings to load data from Csv file to Table. In previous version mappings was known as Interface.

On left Designer tab Expand Projects tab and click project icon and select New Project.








.



On right side a new project panel  is open. Enter information as below.
Name: Load_Matches_CSV_to_Table











On left side you can check project is created.










Step 8. Now we will Import Knowledge module. Right click Knowledge module and select Import Knowledge Modules.




















For loading our file we will use following loading KMs
               1)      IKM SQL Incremental Update
               2)      LKM File to SQL
On Knowledge module select IKM SQL Incremental Update



Now scroll down to find LKM File to SQL and select. Click Ok.

























Click Close.




































Verify that both KMs are imported in project.



















Step 9. Now we will create Mappings.
Expand Designer -> Projects -> Load_Matches_CSV_to_Table -> First Folder and right click Mappings and select New Mapping.

















A new mapping popup will be shown. Enter following details and click ok.




















On right side a New blank window will open to drag source and target mappings.





















Now Expand Designer -> Models -> IPL_Match_ODI_Model -> IPL_MATCH_ODI   drag and drop IPL_Match_ODI(Matches.csv) to right side opened window.









































Now Expand IPL_Matches_DB  and drag and drop MATCHES to right side.

























Now we have both the mappings one for source and another for target. Now we have to link both the source and target mappings. And map the columns.

You can see the connectors on both side of the boxes.















Now we will connect both the boxes with connectors. Click IPL side connector and drage it to MATCHES connector. And click OK.







































You can observe that columns which are matched with the name are mapped automatically by dark gray symbol 

those are not matched need to match manually by drag from left to right.
















Now we will match the column which are not matched automatically by drag and drop from left to right. For example IPL side date Is not matched. We will match date column with MATCH_DATE.



















We will now match other column same way. The list of columns need to be matched are as below.
                1)      Id -> Match_ID
                2)      Date -> Match_Date
                3)      Result -> Match_Result
Now all columns are matched. Now save your work.
Click on MATCHES and in properties select Target and select Incremental Update as Integration type.






























Now our mapping Is completed. We can run mapping and load data to database.

Now click on green arrow as shown below.






Click Ok on Run dialog box.


Click OK.










Open Operator tab and expand Date -> Today and expand the latest job. You can see green circle which executed successfully.




















Now verify our table MATCHES in Sql Developer by selecting our table as below. As you can see that our data is loaded successfully. In CSV total records are 577 and loaded are also 577.
























This concludes loading data from CSV file to Oracle table.

You can reach me at taral.vyas@yahoo.com or taralvyas74@gmail.com