MariaDB Connect CSV Engine

Goal

Connect to a csv - a delimiter separated file, and xml file to run sql directly on it. This is to facilitate the importing of data feeds into mysql.

Updated: Thurs, 2017-02-02. When moving the csv example from my test server to production, the example gave a permission error. After much internet searching and staring at mariadb’s knowledgebase, https://mariadb.com/kb/en/mariadb/grant/#global-privileges, I determined that the user must have global FILE privileges, to be able to interact with the csv file. Note, that they can have reduced privileges for actual work on specific databases and tables. This will, hopefully, prevent any security issues.

Note: I consider mariadb’s documentation to be a good reference but, it needs more sophisticated examples, especially with the CONNECT engine. For example, some create table definitions have 'not null' and others 'default null'. Why? Can I create a csv table with the same column definitions and constraints, as a mysql table? If not, what are the differences? The article 'CONNECT Data Types,' https://mariadb.com/kb/en/mariadb/connect-data-types/, attempts to help you, but it also needs more examples.

Sample Data

Here are the columns of data in the files:

  • person_id

  • first_name

  • middle_name

  • last_name

  • nexus [id]

  • uwdir [id]

Install the CONNECT engine:

As a plugin (2):

  • _INSTALL PLUGIN CONNECT SONAME 'haconnect';

  • SHOW PLUGINS;

  • SHOW ENGINES;

Add the FILE privilege:

With your superadmin user (8):

  • _SHOW GRANTS FOR 'mysqluser'@'localhost';

  • _GRANT FILE ON . TO 'mysqluser'@'localhost';

  • _SHOW GRANTS FOR 'mysqluser'@'localhost';

CSV Procedure

Grant the mysqld process full access rights to the file (1):

  • copy the file to the MySQL data directory

  • ensure it is owned by the user and group 'mysql'

  • remove the top line with the column names if you need to sort the file by its keyed columns; before exporting sort in the order that you want the data in to keep the header

  • create the connect table in your mysql interface of choice

DROP TABLE IF EXISTS person_csv;
CREATE TABLE IF NOT EXISTS person_csv (
   person_id int(10) NOT NULL,
   first_name varchar(50) NOT NULL DEFAULT '',
   middle_name varchar(50) NOT NULL DEFAULT '',
   last_name varchar(50) NOT NULL DEFAULT '',
   nexus varchar(8) NOT NULL DEFAULT '',
   uwdir varchar(60) NOT NULL DEFAULT ''
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=csv file_name='/usr/local/var/mysql/person.csv' header=1 SEP_CHAR='\t' quoted=1;

To test file access, run a basic query on it:

  • _SELECT * FROM personcsv;

You can add data to the file:

  • _INSERT INTO person_csv (person_id, first_name, middle_name, lastname, nexus, uwdir) VALUES (123456, 'Stu', 'Art', 'Dent', 's200dent', 's200dent');

Now, try updating:

  • _UPDATE person_csv SET uwdir = 's201dent' WHERE personid = 123456;

Try deleting:

  • _DELETE FROM person_csv WHERE personid = 123456;

PIVOT Procedure

With my present understanding of this feature, it appears that this table type is being used to extend mysql’s sql command set so that it can do pivot tables. Here is their explanation[5]:

This table type can be used to transform the result of another table or view (called the source table) into a pivoted table
along 'pivot&' and 'facts' columns. A pivot table is a great reporting tool that sorts and sums
(by default) independent of the original data layout in the source table.
...

The CONNECT implementation of the PIVOT table type does much of the work required to transform the source table:

    Finding the 'Facts&' column, by default the last column of the source table.
    Finding the 'Pivot&' column, by default the last remaining column.
    Choosing the aggregate function to use, 'SUM' by default.
    Constructing and executing the 'Group By' on the 'Facts' column, getting its result in memory.
    Getting all the distinct values in the 'Pivot' column and defining a 'Data' column for each.
    Spreading the result of the intermediate memory table into the final table.

If a different result is desired, Create Table options are available to change the defaults used by Pivot.

XML/HTML Procedure

Some points [4]:

  • XML is case sensitive and thus use the appropriate case when creating the table

  • column order can be different from node order

  • can select the field nodes using xpath

  • the at sign (@) will select a tag attribute

  • xml tables can be sorted and joined to

  • there is a way of specifying how to read multiple nodes with the same tags, i.e. a many relationship

  • multiple nodes and intermediate multiple nodes complicate writing, i.e. updating and inserting, greatly

    • if it is simple, I would write back to the base file; otherwise, an xml export might be a better way to generate the correct xml

  • Read reference 2 because parsing the data can be quite complicated

Phpmyadmin’s xml export of the sample data, for some reason, defined what CONNECT would view, as the table, as the database name. Also, each row is given the table name (ok, this sort of makes sense). Here is the xml CONNECT table definition taking into account the previous info, and referencing the fields using xpath:

This produces one blank record.

DROP TABLE IF EXISTS person_xml;
CREATE TABLE IF NOT EXISTS person_xml (
   person_id int(10) NOT NULL,
   first_name varchar(50) NOT NULL DEFAULT '',
   middle_name varchar(50) NOT NULL DEFAULT '',
   last_name varchar(50) NOT NULL DEFAULT '',
   nexus varchar(8) NOT NULL DEFAULT '',
   uwdir varchar(60) NOT NULL DEFAULT ''
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=xml file_name='/usr/local/var/mysql/person.xml' tabname='ofis_17';

DROP TABLE IF EXISTS person_xml;
CREATE TABLE IF NOT EXISTS person_xml (
   person_id int(10) NOT NULL,
   first_name varchar(50) NOT NULL DEFAULT '',
   middle_name varchar(50) NOT NULL DEFAULT '',
   last_name varchar(50) NOT NULL DEFAULT '',
   nexus varchar(8) NOT NULL DEFAULT '',
   uwdir varchar(60) NOT NULL DEFAULT ''
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=xml file_name='/usr/local/var/mysql/person.xml' option_list='Coltype=@';

DROP TABLE IF EXISTS person_xml;
CREATE TABLE IF NOT EXISTS person_xml (
   person_id int(10) NOT NULL,
   first_name varchar(50) NOT NULL DEFAULT '',
   middle_name varchar(50) NOT NULL DEFAULT '',
   last_name varchar(50) NOT NULL DEFAULT '',
   nexus varchar(8) NOT NULL DEFAULT '',
   uwdir varchar(60) NOT NULL DEFAULT ''
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=xml file_name='/usr/local/var/mysql/person.xml'
tabname='ofis_17' option_list='Coltype=@';

This produces a table with no record.

DROP TABLE IF EXISTS person_xml;
CREATE TABLE IF NOT EXISTS person_xml (
   person_id int(10) NOT NULL,
   first_name varchar(50) NOT NULL DEFAULT '',
   middle_name varchar(50) NOT NULL DEFAULT '',
   last_name varchar(50) NOT NULL DEFAULT '',
   nexus varchar(8) NOT NULL DEFAULT '',
   uwdir varchar(60) NOT NULL DEFAULT ''
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=xml file_name='/usr/local/var/mysql/person.xml'
option_list='rownode=person,Coltype=@';

DROP TABLE IF EXISTS person_xml;
CREATE TABLE IF NOT EXISTS person_xml (
   person_id int(10) NOT NULL,
   first_name varchar(50) NOT NULL DEFAULT '',
   middle_name varchar(50) NOT NULL DEFAULT '',
   last_name varchar(50) NOT NULL DEFAULT '',
   nexus varchar(8) NOT NULL DEFAULT '',
   uwdir varchar(60) NOT NULL DEFAULT ''
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=xml file_name='/usr/local/var/mysql/person.xml' option_list='rownode=person';

This produces a table creation error.

DROP TABLE IF EXISTS person_xml;
CREATE TABLE IF NOT EXISTS person_xml (
   person_id int(10) NOT NULL field_format='@person_id',
   first_name varchar(50) NOT NULL DEFAULT '' field_format='@first_name',
   middle_name varchar(50) NOT NULL DEFAULT '' field_format='@middle_name',
   last_name varchar(50) NOT NULL DEFAULT '' field_format='@last_name',
   nexus varchar(8) NOT NULL DEFAULT '' field_format='@nexus',
   uwdir varchar(60) NOT NULL DEFAULT '' field_format='@uwdir'
) ENGINE=CONNECT DEFAULT CHARSET=utf8 table_type=xml file_name='/usr/local/var/mysql/person.xml'
tabname='ofis_17' option_list='rownode=person';

To test file access, run a basic query on it:

  • _SELECT * FROM personxml;

I tested the xml connection using the knowledgebase tutorial and the examples worked. This suggest to me that the xml produced by phpmyadmin was malformed, hence, the above tables not working.

References