MySQL Federated connection, step-by-step

Posted on May 24, 2013


The FEDERATED engine can be used to push data from one database server to another. Let's say the source server is called the local server and the target server the remote server.

1. Create a table on the remote server using a known engine. It is good practice to define the engine explicitly here; you could rely on the server's default engine, but stating it makes engine mismatches easier to debug later.

CREATE TABLE RemoteSpecies (
speciesId INT NOT NULL
, speciesName VARCHAR(250)
, speciesLatName VARCHAR(250)
, observerName VARCHAR(250)
, observationTime INT
, PRIMARY KEY (speciesId)
) ENGINE=InnoDB;

Note the absence of AUTO_INCREMENT on the primary key of this table. This table is only ever written to through the Federated connection, which supplies the id values, so the remote server never needs to generate them itself.

2. Switch to the local server and create a server connection.

CREATE SERVER fedlink
FOREIGN DATA WRAPPER mysql
OPTIONS (USER 'remote_server_mysql_user', PASSWORD 'remote_user_password', HOST 'remote_server_ip', PORT 1234, DATABASE 'remote_server_database');

You will want to create a dedicated user on the remote system just for this, and grant it access only to the remote table defined in step 1. Opening the port can take a bit of work: the default MySQL port is 3306, and many routers and firewalls have ready-made procedures for opening it for MySQL traffic. See one solution using bind-address and ufw.
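A sketch of such a restricted user, run on the remote server; the user name, host and password here are placeholders, not anything from the original setup:

```sql
-- Create a user that may only touch the one landing table,
-- and only when connecting from the local server's address.
CREATE USER 'fed_user'@'local_server_ip' IDENTIFIED BY 'a_strong_password';
GRANT SELECT, INSERT, UPDATE, DELETE
    ON remote_server_database.RemoteSpecies
    TO 'fed_user'@'local_server_ip';
```

This way, even if the credentials in the server connection leak, they open nothing beyond that single table.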

Just a note: it is possible to ALTER server connections just as if they were tables.
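For example, assuming the server connection was named fedlink (as it is in step 6 below), pointing it at the standard MySQL port later could look like:

```sql
ALTER SERVER fedlink OPTIONS (PORT 3306);
```

Only the options you list are changed; the rest of the connection definition stays as it was.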


3. On the local server, create the federated table.

CREATE TABLE Species (
speciesId INT NOT NULL AUTO_INCREMENT
, speciesName VARCHAR(250)
, speciesLatName VARCHAR(250)
, observerName VARCHAR(250)
, observationTime INT
, PRIMARY KEY (speciesId)
) ENGINE=FEDERATED CONNECTION='fedlink/RemoteSpecies';

Test by inserting content. Anything you do in the local table Species has an immediate effect on the RemoteSpecies table. If it does not, either your server connection is not working (check firewalls too; you can test the port with telnet) or the local and remote table structures are not identical. The tables may have different names, but the column definitions have to match exactly; the local table uses the FEDERATED engine while the remote table keeps its real engine. See more about the engine issue in the previous post.
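A minimal smoke test, assuming the table names from steps 1 and 3; the row values are made up for illustration:

```sql
-- On the local server:
INSERT INTO Species (speciesId, speciesName, speciesLatName, observerName, observationTime)
VALUES (1, 'Great tit', 'Parus major', 'Test Observer', 1369382400);

-- On the remote server, the same row should now be visible:
SELECT * FROM RemoteSpecies WHERE speciesId = 1;

-- Clean up from the local side; the delete is pushed through as well.
DELETE FROM Species WHERE speciesId = 1;
```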

Further development: synchronize two databases using a one-way connection with federated engines.

Let's say two applications, such as a back-end warehouse system and a company website, use the same data. However, the data lives on two different servers, and the IT department requires that the website never opens connections towards the warehouse database. Also, not all information in the warehouse system should be public, so a selection of the data has to be made for publishing. Under healthily paranoid security rules, no secret information may be transferred to the web server under any circumstances. A combination of views, federated engines and cron jobs will satisfy all of these requirements.

4. Let's continue from above by creating a view on the local server. This view contains the public data that should be made available on the remote server. You can compose the view any way you wish, possibly creating several view and remote-federated table pairs.

At this point you might want to drop the AUTO_INCREMENT from the table created in step 3. It is no longer needed, as the data will be filled from the view.
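One way to do that is to simply recreate the step 3 table; dropping a FEDERATED table only removes the local definition and never touches the remote data:

```sql
DROP TABLE Species;
CREATE TABLE Species (
speciesId INT NOT NULL
, speciesName VARCHAR(250)
, speciesLatName VARCHAR(250)
, observerName VARCHAR(250)
, observationTime INT
, PRIMARY KEY (speciesId)
) ENGINE=FEDERATED CONNECTION='fedlink/RemoteSpecies';
```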

One more thing to consider before creating any views: do you want a separate database for the transfer? It can be a good idea to keep your moving data apart from the databases that are actually being modified all the time. At the very least it keeps things in good order in the human mind.

Let's say separate databases for moving data is the way to go. If you did not already create the federated table from step 3 in its own database, create the database now and put the federated table in it. I'll call this database "MovingData".

Now for the view. It is created inside the moving database and takes advantage of referencing data from one database to another within the same server (ApplicationDatabase to MovingData). To continue with the birds theme, this view picks only the parts that are needed on the remote server. For example, we could have a nature preservation organisation's website, where we want to display some information about the species encountered in the area. The view takes the bird names, the name of the observer and the observation time, but leaves out sensitive information such as the exact location of the bird. This example is a bit more complicated than a SELECT in its basic form, but the idea is to show that views are the perfect tool for gathering the needed information from several sources into a single transfer.

CREATE VIEW Fed_view_Species AS
SELECT species.speciesId
, species.speciesName
, species.speciesLatName
, people.observerName
, observation.observationTime
FROM ApplicationDatabase.Observations observation
LEFT JOIN ApplicationDatabase.Species species ON species.speciesId = observation.speciesId
LEFT JOIN ApplicationDatabase.People people ON people.personId = observation.observerId;

5. Create a script to transfer the data into the federated table. Here we'll set up an automated job which empties the table created in step 3 and immediately refills it with the data from the view created in step 4. Again you should consider which MySQL user does this; I have created a user on the local server's MySQL just for this.
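A sketch of such a dedicated local user; the name and password are placeholders. Note that in MySQL, TRUNCATE requires the DROP privilege, which is why it appears in the grant:

```sql
CREATE USER 'LocalMysqlUser'@'localhost' IDENTIFIED BY 'a_strong_password';
-- Read access to the view feeding the transfer:
GRANT SELECT ON MovingData.Fed_view_Species TO 'LocalMysqlUser'@'localhost';
-- Fill and empty the federated table (TRUNCATE needs DROP):
GRANT SELECT, INSERT, DROP ON MovingData.Species TO 'LocalMysqlUser'@'localhost';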

With Linux or a similar Unix system, you can first try this from the command line.

mysql -u LocalMysqlUser -pPASSWORDHERE MovingData -e 'TRUNCATE Species; INSERT INTO Species SELECT * FROM Fed_view_Species;'

Note that the syntax of this command requires you to give the password with the -p option, written immediately after the -p with no space, = sign or quote marks. For example, if the password is 123/HgK, you type -p123/HgK. After the password comes the database name without any option flag, here "MovingData". The -e option takes the query, wrapped in quotes. If this works from the command line, put it in crontab or whatever automated system your server uses. For crontab, refreshing the data once per day at 1:25 AM would look like this:

25 1 * * * mysql -u LocalMysqlUser -pPASSWORDHERE MovingData -e 'TRUNCATE Species; INSERT INTO Species SELECT * FROM Fed_view_Species;'

I have seen systems where the -e option does not accept two queries at once. In that case, try splitting the job over two lines like this:

25 1 * * * mysql -u LocalMysqlUser -pPASSWORDHERE MovingData -e 'TRUNCATE Species;'
26 1 * * * mysql -u LocalMysqlUser -pPASSWORDHERE MovingData -e 'INSERT INTO Species SELECT * FROM Fed_view_Species;'

Note that this leaves a one-minute gap during which the remote table is empty. See the next step.

6. Think about the setup on the remote end. On the local end we considered whether the moving data and its related tables, the view and the federated table, should live in their own otherwise empty database. Why not do the same on the remote end? That keeps the incoming data nicely in its own area, from which you can take it further for the needs of the website or whatever the remote system is. It also removes the empty-data problem, in case you need to split the cron job into two lines, or when your system is so busy that it cannot accept the interruptions a truncate-insert rebuild causes even when run as a single command. Let's decide to make a dedicated database for imported data, called "Incoming".
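Concretely, the remote-side preparation could look like this, reusing the column list from step 1:

```sql
-- On the remote server:
CREATE DATABASE Incoming;
USE Incoming;
CREATE TABLE RemoteSpecies (
speciesId INT NOT NULL
, speciesName VARCHAR(250)
, speciesLatName VARCHAR(250)
, observerName VARCHAR(250)
, observationTime INT
, PRIMARY KEY (speciesId)
) ENGINE=InnoDB;
```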

In step 2 we created a server connection, and the remote database name was defined as part of it. To move the remote table into the new database, create the Incoming database, create the table from step 1 inside it, and then alter the server connection's database name to Incoming as explained in step 2.

ALTER SERVER fedlink OPTIONS (DATABASE 'Incoming');

The catch: you need to run this on the local server, not on the remote one.

When the Federated connection works from the MovingData database to the Incoming database, create a view in Incoming.

CREATE VIEW Fed_view_Species AS
SELECT * FROM Species;

This view now shows the data in the remote table. The effort of making a view here might not be strictly necessary, but it gives a bit more security, as the incoming data will only ever be used through this view. It also lets you further limit the data, or split it out into several tables, if needed. Then create a real table on the remote server, preferably not in the Incoming database but in the real one that your website or other application uses. Let's call this table frozen, as its data will always be frozen for a day and then refreshed once during the night.

CREATE TABLE Species_frozen (
speciesId INT NOT NULL
, speciesName VARCHAR(250)
, speciesLatName VARCHAR(250)
, observerName VARCHAR(250)
, observationTime INT
, PRIMARY KEY (speciesId)
) ENGINE=InnoDB;

The last thing to do is to create on the remote server an automated job similar to the one from step 5 on the local server. It refreshes the frozen table once per day. It is important to time it so that the refresh happens after the local server has completed its own jobs. In the example above we set the local server to update at 1:25–1:26 AM. If you have big tables, leave enough time for the federated end to finish its queries before the refresh script for the frozen table starts. One minute is certainly enough in many cases, but just to be sure you could leave an hour or so; for your website, the data is available in the frozen table the whole time anyway.

25 2 * * * mysql -u RemoteMysqlUser -pPASSWORDHERE Incoming -e 'TRUNCATE Species_frozen; INSERT INTO Species_frozen SELECT * FROM Fed_view_Species;'

If you need the data in this table with no gaps at all, not even the half a second this cron job takes, or when you are moving huge chunks of data, use a temporary working table and the SQL RENAME TABLE command. The script to do that would look something like this:

25 2 * * * mysql -u RemoteMysqlUser -pPASSWORDHERE Incoming -e 'DROP TABLE IF EXISTS `Species_temp`; CREATE TABLE `Species_temp` SELECT * FROM `Fed_view_Species`; RENAME TABLE `Species_frozen` TO `Species_old`, `Species_temp` TO `Species_frozen`; DROP TABLE IF EXISTS `Species_old`;'
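Written out as plain SQL, the swap looks like the following. Here CREATE TABLE ... LIKE is used so the new copy keeps the primary key, which a plain CREATE TABLE ... SELECT would not preserve:

```sql
DROP TABLE IF EXISTS Species_temp;
-- Copy the structure (including the primary key), then fill it:
CREATE TABLE Species_temp LIKE Species_frozen;
INSERT INTO Species_temp SELECT * FROM Fed_view_Species;
-- RENAME TABLE swaps both names in one atomic statement,
-- so readers never see a missing table.
RENAME TABLE Species_frozen TO Species_old, Species_temp TO Species_frozen;
DROP TABLE IF EXISTS Species_old;
```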

This technique is sometimes referred to as a materialized view. Big commercial database systems like Oracle have it built in behind a very easy interface; we open source users need to run a few more commands to achieve the same.