How Data Pipelines work

New

We make it easy to connect data from sources to destinations—including Journeys—so that you can take advantage of your data, no matter where it comes from or where you want it to go.

Why use Customer.io Data Pipelines?

Data Pipelines serve two major purposes in your stack:

  • We make it easy to connect your data sources to the destinations where you activate your data, so you don’t have to spend time and resources maintaining your own complex integrations.
  • Data Pipelines prevent you from getting locked into particular vendors by giving you access to, and complete control of, your data.

You’ll set up Sources—like your website or data warehouse—places where you create and house data. Then we’ll transform and route that data to any number of Destinations where you can store, utilize, and act on your data. You can even use Customer.io Journeys as a destination. This lets you capture data from any number of sources on people you can message, events that trigger campaigns, and so on.

All data sources provide data in the same shape and format. No matter what sources you connect, you’ll always know what calls you’ll need to make and what output you should expect. We also provide a set of default actionsThe source event and data that triggers an API call to your destination. For example, an incoming identify event from your sources adds or updates a person in our Customer.io Journeys destination. for each destination, which automatically reformat your source data to fit the destination. So, unless you want to customize the way we map data to your destinations, you won’t need to learn your destinations’ APIs; and even if you do want to customize actions, we provide them in plain text, so it’s easier to understand and manipulate than your standard API call.

flowchart LR subgraph Sources a(Your Website) b(Server-side data) end c((Customer.io)) subgraph Destinations d(Your CRM) e(Your analytics
platform) f(Customer.io
Journeys) end a-->|JS integration|c b-->|Go, Python, or
Node integration|c c-->|Send website and
server-side data|d c-->|Send website source only|e c-->|Send website and
server-side data|f linkStyle 0,3 stroke-width:2px,fill:none,stroke:#AF64FF linkStyle 1 stroke-width:2px,fill:none,stroke:#00ECBB linkStyle 2,4 stroke-width:2px,fill:none,stroke:#0597AD

The ability to resend data prevents vendor lock-in

While customer data platforms make it easy to connect new sources and destinations relatively easily, that’s only one part of preventing vendor lock in. One of the reasons people get locked into particular vendors is historical data—you don’t want to lose data that you’ve stored in a particular service, and it may not be easy to export that data to a new service.

But Customer.io stores your source data indefinitely. If you’re on a premium or enterprise plan, you can resend your historical data to new destinations. This means that when you want to move from one service to another, you can add a new destination, connect your sources, and replay your old data to the new destination. That way, every new destination has access to all of your historical data, so you don’t have to start fresh and can use the services that best fit your business.

flowchart LR a(website)-->|real time
source data|c b(server-side source)-->|real time
source data|c subgraph c [Data Pipelines] direction LR z(Real time data) y(historical data) end z-.-x|disconnect old
destination|d(old destination) z-->|connect real time
source data|e(new destination) y-->|replay old data|e

Connections: integrations at a glance

Data Pipelines helps you connect the data you collect (data sources) to places where you activate or use that data (destinations). Each source can connect to multiple destinations and each destination can have multiple sources, which can make it hard to trace your integrations.

We’ve designed the Connections page to help with that. This page gives you an overview of all your sources and destinations—including quick indicators for errors and links to the most important settings. You can hover over a source or destination to see what it’s connected to.

the connections page, hovering over a javascript source and showing it connected to several destinations
the connections page, hovering over a javascript source and showing it connected to several destinations

Sources, on the left, shows whether you’re actively receiving data for a source. You can click to show the most recent data for that source. This gives you a quick way to check up on your data sources, make sure that you’re sending the right data into your pipeline.

Destinations, on the right, shows whether there are any errors and links you to your actionsThe source event and data that triggers an API call to your destination. For example, an incoming identify event from your sources adds or updates a person in our Customer.io Journeys destination.—the things that cause us to send data to your destination. Click to go to a destination’s actions and fix errors or change the way we forward data to your destination.

The Data Pipelines connections page showing an error for the Salesforce destination
The Data Pipelines connections page showing an error for the Salesforce destination

Sources

A source is a website, service, mobile app, or anything that you want to capture data from—it’s a “source” of data!

When you set up a source, you’ll install one of our source libraries, and use that to send data to Data Pipelines. Then you can route that data to one or more destinations. Today we support the following types of sources:

  • JavaScript: In general, you’ll want to install our JavaScript source in your website(s). This client-side library is easiest way to gather source data in Customer.io.
  • Server-side libraries (NodeJS, Python, Go) help you send data directly from servers when you can’t gather data from your client.

Your workspace is automatically a source

If you use Customer.io Journeys, we’ll automatically send Customer and Message events into Data Pipelines, so you can forward this information into destinations. By default, we send all possible messaging events into Data Pipelines.

Destinations

A destination is some location that you want to send data to. We’ll automatically transform data to fit the shape that each destination expects, making it easy for you to send data from one or more sources to a destination without writing your own code.

We have a whole catalog of destinations for your data. Check it out!

Your workspace can be a destination

You can feed data from various sources directly into your workspace so you can take advantage of Customer.io JourneysCustomer.io’s messaging product, where you can add people, send messages, and set up automated campaigns. If you’re a longtime user, Journeys represent Customer.io’s total featureset before we added Data Pipelines to the mix.. As a destination, you can add people to your workspace and send events to Customer.io Journeys based on data from outside sources without setting up a complicated integration with a third party product.

Customer.io can be a destination for data
Customer.io can be a destination for data

Data Residency: US and EU regions

Data Pipelines works in both our US and EU regions. While your region accounts for the location of your Customer.io data, it doesn’t account for your sources or destinations. You could be in our EU region but send data to a downstream destination in the US, and vice versa.

Some destinations, like Mixpanel, have regional settings that you can configure to make sure that your data stays in the region you want it to. When you set up destinations, make sure they store data in the right region for you.

Copied to clipboard!
  Contents
Is this page helpful?
Chat with AI