A friend is learning APEX to automate his business mentoring athletes. In his practice, he captures training performance metrics, then needs to analyze clients’ progress toward their goals over time. His instruments record an athlete’s performance data in a comma-separated values (CSV) format. Recently, he asked if APEX could load data from a CSV file.
After showing him how easy it was, he added another twist:”What if I have a directory full of CSV files, organized into subdirectories by athlete? After a week’s sessions with clients, there could be a lot of files. Will I have to import them one by one? ” Great question!
He said each directory reflects the client name, each file name includes the date and time of the session (SESSION_2023_08_11_13_45.csv
), and each file has the same format. A perfect chance to show him three more APEX features:
- Easy file uploading,
- Simplified Zip file processing, and
- Automatic data loading using a pre-defined column mapping.
The sample app you can download using the link at the end of the article illustrates the declarative configuration and small amount of code required to put these techniques into action.
Setting Up the Data Model
For simulation purposes, I chose a CSV file format that includes the name of an activity followed by two numerical values in each row like this:
AAA,333,887
BBB,123,899
CCC,445,1299
I created the following tables to store the athletes, the training sessions for each athlete, and the athlete session training activity data. I added a “staging” table to hold the uploaded data temporarily before moving it to the athlete session data table, related to the appropriate athlete’s training session.
Creating the Data Load Definition
The next step was creating an APEX Data Load Definition to capture how the CSV column data maps to columns in the EBA_DEMO_UPLOADED_SESSION_DATA
table. In the APEX App Builder, under Shared Components, in the Data Sources section, I chose Data Load Definitions and clicked Create. On step 1, as shown below, I entered the name athlete_session_data_csv
and chose the EBA_DEMO_UPLOADED_SESSION_DATA
table.
On step 2 of the wizard, I dragged and dropped an example SESSION_2023_08_04_18_20.csv
data file into the file upload drop zone.
On step 3, I unchecked the First line contains headers checkbox and mapped the values of the three CSV file columns to appropriate columns of the EBA_DEMO_UPLOADED_SESSION_DATA
table. If the CSV file had contained an initial “header” row of column names, APEX might have been able to guess the mapping automatically. However, the files my friend showed me did not have headers so I wanted to show him that use case.
To complete the wizard, I clicked on (Create Data Load).
Create Package for Handling the Upload
I used the Object Browser to create an eba_demo_bulk_data_load
package with the spec below. The from_uploaded_zip_file()
procedure accepts a single parameter for the name of the uploaded zip file to process. In the package body, I initially left the implementation of from_uploaded_zip_file()
as a “TODO” task by using a null;
statement. I’d later circle back to add in the upload processing code that I explain in detail below. For the moment, having the specification in place was enough to “wire up” the declarative interaction between the page in the next section and the procedure.
create or replace package eba_demo_bulk_data_load is
procedure from_uploaded_zip_file(
p_file_name in varchar2);
end;
Uploading the Zip File
On the Home page, I added a single page item named P1_ZIP_FILE_UPLOAD
of type File Browse… , set its Storage Type property to Table APEX_APPLICATION_TEMP_FILES, and its Purge File at property to End of Request. I also configured its File Types property to application/zip
to guide the user to only choose zip files. Finally, I added a button named Upload_and_Load_All_Data
to submit the page. This is the only configuration needed to upload a file for server-side processing during the page submit.
Processing the Zip File Contents
In the Processing tab of the Home page, I added an Invoke API page process to call the from_uploaded_zip_file()
procedure in the
package. I configured its Server-side Condition so it only executes when the eba_demo_bulk_data_load
Upload_and_Load_All_Data
button is pressed.
As shown below, I selected the p_file_name
function parameter in the Processing tab and used the Property Editor to configure its value to come from the P1_ZIP_FILE_UPLOAD
page item. A File Browse… page item’s value gives the unique name of the uploaded file in the APEX_APPLICATION_TEMP_FILES
view. We’ll use this info below to process the CSV files inside the uploaded zip file.
With these pieces in place, I had a working zip file upload page. Next, I turned my attention to the small amount of code required to “unpack” the zip file contents and process each CSV file it contained…
Loading Each CSV File in the Zip
In the implementation of the the from_uploaded_zip_file()
procedure, I started by selecting the binary contents of the uploaded zip file from the APEX_APPLICATION_TEMP_FILES
view into an l_zipfile
BLOB variable, using the unique file name passed in the p_file_name
parameter. Note that in the WHERE clause I’m using the NAME
column and not the FILENAME
column. The former is the unique name APEX assigns to the uploaded file (e.g. 12345678/athletes.zip
), while the latter is just the “plain” name of the file the end-user provided (e.g. athletes.zip
).
-- Get the uploaded zip file
select blob_content
into l_zipfile
from apex_application_temp_files
where name = p_file_name;
Next, using the handy apex_zip
package, I get the list of CSV files inside this uploaded zip file:
-- Get list of files from the zip
l_files := apex_zip.get_files(l_zipfile);
Then I loop over the file names and load them one by one into the “staging” table using the pre-defined Data Loading Definition I created above. Before each iteration, I delete all the rows from the staging table, and in the next section I’ll explain how I determine the athlete and athlete session with which to associate the loaded data from the current CSV file in the loop. Notice I use another apex_zip
package function get_file_content()
to access the contents of the current CSV file. I pass that as the value of the p_data_to_load
parameter of the apex_data_loading.load_data()
function, referencing the static id of the data loading definition as the second parameter value.
-- Process the individual files inside the zip
for j in 1..l_files.count loop
-- Start by clearing any existing rows out of the temporary upload table
delete from eba_demo_uploaded_session_data;
-- Get the contents of the current CSV file in the zip file
l_csv_file := apex_zip.get_file_content(l_zipfile,l_files(j));
-- Load data from current CSV file into temp data loading table
-- using 'athlete_session_data_csv' data loading definition
l_result := apex_data_loading.load_data(
p_data_to_load => l_csv_file,
p_static_id => 'athlete_session_data_csv');
/* See below for logic to associate data with athlete session */
end loop;
Moving Uploaded Data to Final Destination
After loading each CSV file into the “staging” table, I use the name of the file and the name of its directory to determine the date and time of the session as well as the athlete name. The split()
function in the apex_string
package comes in handy to break the file name into parts using the slash as the separator character. Assuming the file names will all be of the form athletes/AMINA/SESSION_2023_08_01_12_10.csv
, we can see that the athlete name is the second part of the fully-qualified file name and the session date and time is encoded in the third part. Notice I’ve used a date format mask that includes the literal text "SESSION_"
and ".csv"
to easily extract the date with no additional fuss.
for j in 1..l_files.count loop
/* See above for logic to load the current CSV file */
-- Extract the name of the athlete from the current file name
-- that we are assuming will be of the form:
-- athletes/AMINA/SESSION_2023_08_01_12_10.csv
l_filename_parts := apex_string.split(l_files(j),'/');
l_athlete_name := l_filename_parts(2);
l_session_date := to_date(l_filename_parts(3),
'"SESSION_"YYYY_MM_DD_HH24_MI".csv"');
/* See below for logic to lookup or create athlete */
end loop;
Using straightforward code, I lookup the current athlete’s id based on their name. If the athlete name does not exist, I create a new Athlete row for them and return the new athlete id into l_athlete_id
. Then I create a new athlete session for the athlete with the appropriate date and time extracted from the current CSV file name.
for j in 1..l_files.count loop
/* See above for previous logic in the loop */
-- Lookup the athlete id from their name. If not found, then
-- create a new athlete with this name and return the newly
-- assigned athlete id into l_athlete_id
begin
select id
into l_athlete_id
from eba_demo_athletes
where name = l_athlete_name;
exception
when no_data_found then
insert into eba_demo_athletes(name)
values(l_athlete_name)
returning id into l_athlete_id;
end;
-- Create a new athlete session for the athlete id
-- and session date (inferred from the CSV file name above)
insert into eba_demo_athlete_sessions(athlete_id, session_date)
values (l_athlete_id, l_session_date)
returning id into l_new_session_id;
/* See below for final logic to move uploaded data in place */
end loop;
Finally, I select the uploaded performance data from the “staging” table and insert it into the Athlete Session Data table along with the correct Athlete Session foreign key so it’s related to the new athlete session for the current file’s athlete.
for j in 1..l_files.count loop
/* See above for previous logic in the loop */
-- Finally, move all the uploaded performance data for the current
-- athlete's session into the newly created athlete session
-- using the l_new_session_id returned above
insert into eba_demo_athlete_session_data
(athlete_session_id, activity, value1, value2)
select l_new_session_id, activity, value1, value2
from eba_demo_uploaded_session_data;
end loop;
Enabling Background Processing
Using the techniques I described in Page Processing Power Unchained, it would be very easy to introduce an execution chain into the Home page’s processing tab and make the current Invoke API page process be its child process. You would accomplish that by setting the Execution Chain property of the latter to the name of the newly-created chain. By setting the new execution chain’s Execute in Background property to ON, its Temporary File Handling property to Move, and its Temporary File Items to P1_ZIP_FILE_UPLOAD
, the zip file CSV upload processing could be effortlessly done in the background while the user moves on to do other tasks. This extra flourish would be appropriate if there might be a large number of CSV files to process inside.
Trying Out the Sample App
You can download the APEX 23.1 sample app from here. This example athletes.zip file contains simulated training data for multiple sessions for four different athletes: BIFF, CHIP, ANNEENA, and WILMA. Try uploading the athletes.zip
file then browsing the Athletes Session Data page.