New to CloverDX? Need a refresher on the basics?
Start here for a high-level overview of some of the core concepts you’ll find when designing and running your data jobs.
Let’s start with some definitions of terms you’ll come across in CloverDX:
In CloverDX, a ‘Project’ is a place where you group together multiple jobs or tasks that relate to each other.
Usually, one project corresponds to one use case, but a large organization might have one project per customer, or per department (such as Finance or Marketing). There’s no limit on the number of projects you can have.
A good rule of thumb when creating projects: if you cannot figure out a name for your project (or only names like “Misc” or “Everything” come to mind), you should split it into multiple projects that each cover only one area.
When you create a new project in CloverDX, you’ll see it has a predefined structure. You don’t have to use this structure; it is our recommendation on how to start. You can freely change the layout, but we recommend you don’t stray too far from it, so that your fellow data engineers can understand the project quickly when they first come to it.
Figure: The CloverDX default project structure, with a number of directories and files.
For more on best practices when setting up a project in CloverDX, read the in-depth post over on our Tech Blog: Starting a new CloverDX project.
You’ll probably start by creating a local project. It will live just in CloverDX Designer, and you run it on your machine as you’re building it.
Note: Everything in CloverDX is just a readable text file (mostly XML), meaning it’s possible to figure out what a CloverDX job is doing just by looking at the files.
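To make the note above concrete, here is a minimal sketch of reading a job file with nothing but a standard XML parser. The sample file is hypothetical and heavily simplified: the element names (`Graph`, `Phase`, `Node`, `Edge`), the attribute names, and the component types are assumptions for illustration, and a real CloverDX job file contains more detail and may differ between versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified job file. Element names, attribute names and
# component types are assumptions for illustration, not a file exported
# from CloverDX.
SAMPLE_JOB_XML = """<Graph name="LoadCustomers">
  <Phase number="0">
    <Node id="READ_CUSTOMERS" type="DATA_READER"/>
    <Node id="FILTER_ACTIVE" type="EXT_FILTER"/>
    <Node id="WRITE_OUTPUT" type="DATA_WRITER"/>
    <Edge id="Edge0" fromNode="READ_CUSTOMERS:0" toNode="FILTER_ACTIVE:0"/>
    <Edge id="Edge1" fromNode="FILTER_ACTIVE:0" toNode="WRITE_OUTPUT:0"/>
  </Phase>
</Graph>"""


def summarize_job(xml_text):
    """Return (components, edges) found in a job's XML text.

    components: list of (id, type) pairs, in document order.
    edges: list of (source id, target id) pairs, with port numbers dropped.
    """
    root = ET.fromstring(xml_text)
    components = [(node.get("id"), node.get("type")) for node in root.iter("Node")]
    edges = [
        # "READ_CUSTOMERS:0" means component READ_CUSTOMERS, port 0.
        (edge.get("fromNode").split(":")[0], edge.get("toNode").split(":")[0])
        for edge in root.iter("Edge")
    ]
    return components, edges


components, edges = summarize_job(SAMPLE_JOB_XML)
for component_id, component_type in components:
    print(f"component {component_id} ({component_type})")
for source, target in edges:
    print(f"data flows {source} -> {target}")
```

Because the files are plain text, the same idea applies to ordinary tooling such as diffing, searching, and code review in version control.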
When you want to do something more complex or share your work with the team, you’ll typically deploy your project to a CloverDX Server “sandbox”.
One project corresponds to one sandbox on the Server.
At first glance, a sandbox on CloverDX Server is the same as a project in Designer: a directory of XML/text files.
But a Server sandbox adds more than you’ll find in a local project. It comes with settings attached, for example access permissions that control who has read/write access, runtime settings that control whether it can run in parallel and in how many instances, and more.
As with projects in Designer, you can have as many sandboxes as you like on your Server.
Some more terms you’ll come across quickly when you start using CloverDX:
A job is one of the most important terms in CloverDX: it is simply how you define the steps you want to perform with your data and how your data flows. There are a few types of jobs, but they all use the same concepts, so once you know how to create jobs of one type, you’ll know how to create them all.
You create jobs in CloverDX Designer – a development environment designed to help you create and manage your data jobs.
In the Designer UI you can see your job in the middle (consisting of components and edges); your project(s) on the left; and your palette of components on the right. At the bottom you have your console to see results when you run something – shown in green when everything’s fine, red when it’s not.
The best way to run jobs is on Server. The CloverDX Server tracks everything you do. If you run thousands of jobs, it’ll remember all of them – you can go back and look at the history of everything that happened. The Job Inspector module then allows you to see what the job itself looks like and how much data was processed. You can do everything you would expect from an environment designed for multiple users in an enterprise (e.g. manage permissions to ensure data security, and more).
You can run your jobs directly from Designer – even when they are stored on a Server. This allows you to use the powerful Designer interface to investigate and build your job quickly.
Let’s start with a simple example of a project you’re designing and running by yourself.
When you’re working in a team in a development environment, the process is basically the same – but typically you’re adding in version control.
Note: CloverDX integrates with all major version control systems. For a closer look at working with version control, watch our webinar: Effective Version Control and Teamwork in CloverDX.
Taking things one step further, when you’re deploying to production, you’ll typically want to use some kind of DevOps automation in your process.
If you want more detail on any of this, the post Understanding the CloverDX Project Lifecycle over on our Tech Blog goes more in-depth on everything, including some best practices for deployment, development and team collaboration.