Sync people from a database

Import and update people from a SQL database, making sure that people in your workspace reflect the latest information from your CRM or another backend system.

Connecting your workspace to a MySQL or Postgres database provides a simple way to add and update people on a recurring interval. You can also set your sync to add people to, or update people in, a manual segment, to automatically trigger campaigns on each sync interval.

Requirements

We support MySQL and Postgres databases (including Amazon Redshift). To sync from a Redshift instance, use our Postgres integration. Contact us if you want to sync with a different database or data warehouse, like BigQuery or Snowflake.

If you use a firewall or an allowlist, you must allow the following IP addresses (corresponding to your account region—US or EU), so that we can connect to your database.

Account region IP Address
US 34.122.196.49
EU 104.155.37.221

Query Requirements

When you create a database sync, you provide a query selecting the people and columns you want to import. Each row returned from your query is a person you’ll add or update; each column is an attribute that you’ll set for each person.

Your query must:

  • Enumerate the columns you want to import. You cannot SELECT *.
  • Select at least one column representing an identifierThe attributes you use to add, modify, and target people. Each unique identifier value represents an individual person in your workspace. in your workspace (email or id).
  • Be limited to 2,000,000 rows and 50 columns, where each row represents a person and each column represents an attribute. Contact us if you want to import more than two million records per sync. See optimize your query for help limiting your query.

 Attributes are case sensitive

While attributes in Customer.io are case sensitive, SQL queries are not. Unless you use AS in your query, we use column names in your query as attribute names in Customer.io. If the column case does not match an attribute in your workspace, you may end up with duplicate attributes, i.e. email and Email!

Case column names in your query in the same format as attributes in your workspace. Use As if you want to map a column to an attribute with a different name.

Select Primary_Email AS email

Add a sync

If you use a firewall or an allowlist, you must allow the following IP addresses (corresponding to your account region—US or EU), so that we can connect to your database.

Account region IP Address
US 34.122.196.49
EU 104.155.37.221
  1. Go to Data & Integrations > Integrations and select the database you want to sync from—MySQL or PostgreSQL. If you want to sync from an Amazon Redshift instance, select the PostgreSQL option. You can search for your database type or click Databases to find it.
  2. Click Create Sync
  3. Enter a Name and Description for the sync and click Sync settings. These fields help you identify sync and are purely informational.
  4. Set your sync settings and click Select database.
    1. How often should this import sync? You can sync your database down to the minute, but large, successive sync operations may slow your workspace.
    2. How do you want to identify people? Select whether to add and/or update people. If your workspace supports both email and ID as identifiers, select the identifier you’ll use to identify people—email or id.
    3. Sync these people to a segment?: As a part of each sync, you can add people to a new or existing segment. Use Create a new segment to set up a new segment specifically for your sync and Sync to an existing segment to add people to another segment in your workspace.
    SQL sync settings
    SQL sync settings
  5. Enter your database credentials and click Add database. We suggest that you use read-only credentials for your database.

    While we don’t write to your database, using read-only credentials ensures that you can’t inadvertently make changes to your database through your query.

    When you add your database, we’ll try the connection to make sure your settings are correct. When you’re done, click Write query to move to the next step. If you added a database as a part of another sync operation, you can select it instead of adding a new database.

  6. Enter your SQL query and click Run query to preview up to 100 rows of results.

    Your query:

    • Must SELECT individual columns. You can use SELECT * in a test query to see your available columns, but you cannot save your sync until you select individual columns.
    • Must include at least one of id or email depending on the IdentifiersThe attributes you use to add, modify, and target people. Each unique identifier value represents an individual person in your workspace. in your workspace.
    • Should include a WHERE clause, comparing recent updates against the {{last_sync_time}} (Unix epoch) to limit syncs to the most recent updates.

     Use SELECT * to see available columns

    While your sync must select individual columns, you can use SELECT * in the Query step to preview the first 100 rows, and all available columns, from your database. This can help you determine which columns you actually want to select.
    input your sql query
    input your sql query
  7. Click Review import to review your sync setup.
  8. Click Create sync to start the import process.

Check the status of a sync

You can click any of your sync operations to see the status of a sync operation, how many people were added or updated, and when the sync started.

  1. Go to Data & Integrations > Integrations and select the database you want to sync from—MySQL or PostgreSQL. You can search for your database type or click Databases to find it.
  2. Click the sync you want to check the status of.

Pause or resume a sync

Pausing a sync lets you skip sync intervals, but doesn’t otherwise change your configuration. If you resume a sync after you pause it, your sync will pick up at its next scheduled interval.

  1. Go to Data & Integrations > Integrations and select your MySQL or Postgres integration—whichever has contains the sync you want to pause or resume.
  2. Click next to the sync you want to modify and select Pause. If your sync is paused and you want to resume it, click Activate.
pause and resume syncs on the MySQL page
pause and resume syncs on the MySQL page

Update a sync

When you update or change the configuration of a sync, your changes are reflected on the next sync interval.

  1. Go to Data & Integrations > Integrations and select your MySQL or Postgres integration—whichever has contains the sync you want to update.
  2. Click the sync you want to update.
  3. Make your changes. Click between Query and Settings tabs to make changes to different aspects of your sync.
  4. Click Save Changes.
Click a sync to change settings and see details
Click a sync to change settings and see details

Delete a sync

Deleting a sync stops syncing/updating people from your database using a particular query. It does not delete or otherwise modify anybody you imported or updated from the database with that query.

  1. Go to Data & Integrations > Integrations and select your MySQL or Postgres integration—whichever has contains the sync you want to delete.
  2. Click next to your sync and select Delete.

Optimize your query

Because your database sync operates on an interval, you should optimize your query to ensure that we import the right information, quickly, with the least noise. When setting up your query, you should consider:

  • Your database timeout value: Queries selecting large data sets may timeout.
  • Cost: Are you charged per query or for the amount of data returned?
  • Can you narrow your query?: Add a “last_updated” or similar column to tables you import, and index that column. You’ll use this column to select the changeset for each sync.
SELECT id, email, first_name, created AS created_at
FROM my_table
WHERE last_updated > {{last_sync_time}}

Last Sync Time

We strongly recommend that you index a column in your database representing the date-time each row was last-updated. When you write your query, you should add a WHERE clause comparing your “last updated” column to the {{last_sync_time}}.

The last sync time is a Unix timestamp representing the date-time when the previous sync started. Comparing a “last-updated” column to this timestamp helps you limit your sync operations to the columns that changed since the previous sync.

If you use ISO date-times, you can convert them to unix timestamps in your query.

SELECT id, email, first_name, created AS created_at
FROM my_table
WHERE UNIX_TIMESTAMP(last_updated) > {{last_sync_time}}
SELECT id, email, first_name, created AS created_at
FROM my_table
WHERE extract(epoch from last_updated) > {{last_sync_time}}

Mapping columns to attributes

We map column names in your query to attributes in your workspace, exactly as formatted in your query. However, SQL is not case sensitive: if a column in your database is called Email, you can use SELECT email to map the column to the email attribute in your workspace.

You can re-map column names from your query to attributes in Customer.io using AS.

SELECT id, email, primary_phone AS phone
FROM people
WHERE updated > {{last_sync_time}}

Sync intervals

You can set your workspace to import from your database on a basis of minutes, hours, days, etc. Be mindful of the time frame that you set, in accordance with the general number of people you expect to import with each sync, and how often you truly need to update people in your workspace.

If a sync is still running when the next sync interval is scheduled to begin, we’ll cancel the next sync interval. If this continues to happen, you should change your interval so that sync operations don’t overlap.

We tested sync performance for a MySQL server against an empty workspace with no concurrent operations (API calls, running campaigns, etc) with the following results. Your results will vary if your query is more complex, or your workspace has multiple concurrent, active users during the sync.

You should adjust sync intervals so that you do not sync more often than the average sync time below, and we strongly recommend that you significantly buffer sync intervals to account for concurrent users in your workspace and additional operations (active campaigns, segmentation, and other operations that affect your audience).

Database rows Database columns Average sync time (mm:ss)
100,000 10 4:20
250,000 10 10:36
500,000 10 21:49
750,000 10 31:22
1,000,000 10 40:39

Errors in the sync process

Sync operations update people and attributes wherever possible. Errors in syncs are reflected in the Activity Log or on a specific person’s Recent Activity.

In general, most issues are of the Failed Attribute Change type relating to changes to id or email IdentifiersThe attributes you use to add, modify, and target people. Each unique identifier value represents an individual person in your workspace.. You are likely to see this error if:

You set an id or email value that belongs to another person. If your workspace identifies people by either email or id, these values must be unique. Attempting to set a value belonging to another person will cause an error.
You attempt to change an id or email value that is already set for a person. You can set an id or email if it is blank; you cannot change these values after they are set in a Sync. You can only change these values from the People page, or when you identify people by cio_idAn identifier for a person that is automatically generated by Customer.io and cannot be changed. This identifier provides a complete, unbroken record of a person across changes to their other identifiers (id, email, etc).), which you cannot use in a Sync.
You set an invalid email value Emails must conform to the RFC 5322 standard. If they do not, you’ll receive an attribute change failure.

FAQ

What databases do you support? We support MySQL and Postgres databases (including Amazon Redshift). Contact us if you want to sync with a different database or data warehouse, like BigQuery or Snowflake. To sync from a Redshift instance, use our Postgres integration.
Do you support SSL or TLS connections? No, but we’re working on adding encrypted connections in a future release. In the meantime, secure your connection by limiting access to approved IP addresses.
Is there a limit to the number of people I can sync at a time?

You cannot add or update more than 2,000,000 people (rows) at a time. Consider adding a LIMIT and ORDER BY to your query, or using a WHERE clause to limit updates to people who have been added or updated since the {{last_sync_time}}. See optimize your query for more information.

Your query cannot SELECT more than 50 columns, where each column represents an attribute.

Contact us if you want to import more rows or columns.

Copied to clipboard!