Creating cron jobs in Node.js: a real-life example using BambooHR

Do you need to run some kind of process every X hours? Wondering how to create scheduled jobs in Node? If that’s why you are here, then this post is for you.

This time, I’m going to write about node-cron, an NPM package used to schedule tasks that run at times defined by cron expressions. Let’s start with some basics:

What’s a cron expression?

A cron expression is a string made up of subexpressions that describe the details of the schedule you want to create. The subexpressions are separated by white space, and each one accepts a limited set of values. The expression is read from left to right, and it can contain from 5 to 7 subexpressions (fields from now on).

The library we selected works with cron expressions of 5 or 6 fields, laid out like this:

  • The first field is for seconds. It is optional and is only used when the cron expression has 6 fields. It accepts values from 0 to 59, or the wildcard ( * ).

  • The second field is for minutes. It accepts values from 0 to 59, or the wildcard ( * ).

  • The third is for hours. It accepts values from 0 to 23, or the wildcard ( * ).

  • The fourth is for day of month. It accepts values from 1 to 31, or the wildcard ( * ).

  • The fifth is for month. It accepts values from 1 to 12, the names of the months, or the wildcard ( * ).

  • The sixth and last is for day of week. It accepts values from 0 to 7 (both 0 and 7 mean Sunday), the name of each day, or the wildcard ( * ).
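To make the field positions concrete, here is a minimal sketch in plain TypeScript that labels the fields of a 5- or 6-field expression. The name labelCronFields is hypothetical and not part of node-cron; the helper only splits and names the fields and does not validate their values.

```typescript
// Hypothetical helper (not part of node-cron): names each field of a cron expression.
type CronFields = {
  second: string | undefined; // undefined when the expression has only 5 fields
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
};

function labelCronFields(expression: string): CronFields {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5 && parts.length !== 6) {
    throw new Error(`Expected 5 or 6 fields, got ${parts.length}`);
  }
  // With 6 fields the first one is seconds; with 5 fields it is omitted.
  const second = parts.length === 6 ? parts[0] : undefined;
  // The last five fields always have the same meaning.
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts.slice(-5);
  return { second, minute, hour, dayOfMonth, month, dayOfWeek };
}
```

For example, labelCronFields('0 1 * * *') reports minute '0' and hour '1' with no seconds field, which matches the "every day at 01:00" reading of that expression.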

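Besides the accepted values, each field can use special operators that allow for more complex schedules, for example:

  • Run at the 10th and 20th minute of every hour: 10,20 * * * *

  • Run every 2 hours, on the hour: 0 */2 * * * (with * */2 * * * the job would fire every minute during every second hour)

  • Run every Sunday at midnight: 0 0 * * Sunday (with * * * * Sunday the job would fire every minute of every Sunday)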
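The operators are easier to reason about once you see which values a field expands to. The following is a minimal sketch, not node-cron's own parser: expandCronField is a hypothetical helper that handles the wildcard, steps, ranges and comma-separated lists for numeric fields only (names such as Sunday are not handled).

```typescript
// Hypothetical helper: expands one numeric cron field into the values it matches.
// Supports "*", "*/n", "a-b", "a-b/n" and comma-separated lists of those.
function expandCronField(field: string, min: number, max: number): number[] {
  const values = new Set<number>();
  for (const part of field.split(',')) {
    const [range, stepText] = part.split('/');
    const step = stepText === undefined ? 1 : Number(stepText);
    if (!Number.isInteger(step) || step < 1) {
      throw new Error(`Invalid step in "${part}"`);
    }
    let start: number;
    let end: number;
    if (range === '*') {
      // The wildcard covers the whole allowed range of the field.
      start = min;
      end = max;
    } else if (range.indexOf('-') !== -1) {
      const bounds = range.split('-').map(Number);
      start = bounds[0];
      end = bounds[1];
    } else {
      start = Number(range);
      end = start;
    }
    if (!Number.isInteger(start) || !Number.isInteger(end) || start < min || end > max || start > end) {
      throw new Error(`Invalid value in "${part}"`);
    }
    for (let value = start; value <= end; value += step) {
      values.add(value);
    }
  }
  return Array.from(values).sort((a, b) => a - b);
}
```

With this sketch, the minute field 10,20 expands to the values 10 and 20, and the hour field */2 expands to every even hour from 0 to 22.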

About node-cron

As mentioned before, we are using node-cron, an NPM package with more than 50,000 weekly downloads that is, at the time of this post, on version 2.0.3. On GitHub, it has 713 stars, 10 contributors and 20 releases since the first one in February 2016.

Since we are going to work in TypeScript, I also suggest installing the types package for node-cron. You can install it by running:

npm install @types/node-cron --save-dev

Setting up node-cron

Creating a scheduled task with node-cron is really easy, and the basic examples in the package documentation explain it well. Here is one of the examples from the page:

var cron = require('node-cron');
 
 cron.schedule('0 1 * * *', () => {
   console.log('Running a job at 01:00 at America/Sao_Paulo timezone');
 }, {
   scheduled: true,
   timezone: "America/Sao_Paulo"
 });

However, this example falls short in a more real-life scenario, like retrieving information from a data source, manipulating it and then inserting it into another database. This is a typical process when you synchronize two systems, and it is exactly what we are going to build today.

Our use case will be to retrieve information from a system called BambooHR (used to manage employees of a company, salaries, vacations, etc), compare it with data from another system and then insert, update or delete the differences. So let’s start first with the cron job.

The cron job

First, we are going to create a class that contains all the logic of the tasks to run; in our case it is called BambooCron. Here is the code for it:

import { schedule, ScheduleOptions, ScheduledTask } from 'node-cron';
import { parseExpression } from 'cron-parser';
import _ from 'lodash';
import moment from 'moment';
import { BambooService } from '../data-access/bamboo/bamboo.service';
import { UserService } from '../api/services/user.service';
import { TimeOffService } from '../api/services/timeOff.service';
import { IHumanResourceManagerService } from '../data-access/IHumanResourceManagerService';

export default class BambooCron {
    private options: ScheduleOptions = {
        scheduled: false
    };
    private task: ScheduledTask;
    private bambooService: IHumanResourceManagerService;
    private usersService: UserService;
    private timeOffsService: TimeOffService;

    constructor() {
        this.task = schedule(process.env.CRON_EXPRESSION
            , this.executeCronJob
            , this.options);
    }

    public startJob() {
        this.task.start();
    }

    private executeCronJob = async () => {
        // HH prints the 24-hour clock; 'hh' would print 12-hour time with no AM/PM marker.
        const format = 'YYYY-MM-DD HH:mm:ss';
        console.info(`Starting cron job at: ${moment().format(format)}`);

        this.usersService = new UserService();
        this.bambooService = new BambooService();
        this.timeOffsService = new TimeOffService();
        await this.processEmployees();
        await this.processTimeOff();

        const cronDate = parseExpression(process.env.CRON_EXPRESSION).next();
        console.info(`Finished cron job. Next iteration at: ${moment(cronDate.toDate()).format(format)}`);
    }

    private async processEmployees() {
        const employees = await this.bambooService.getEmployees();
        const users = await this.usersService.getAllUser();
        const usersToAdd = _.differenceWith(employees, users, (employee, user) => {
            return employee.id === user.bambooId;
        });
        const usersToDelete = _.differenceWith(users, employees, (user, employee) => {
            return employee.id === user.bambooId;
        });
        // Promise.all waits for every save and removal; forEach with an async callback would not.
        await Promise.all(usersToAdd.map((employee) => this.usersService.saveUser(employee)));
        await Promise.all(usersToDelete.map((user) => this.usersService.removeUser(user)));
    }

    private async processTimeOff() {
        const bambooTimeOffs = await this.bambooService.getTimeOffs();
        const dbTimeOffs = await this.timeOffsService.getAllFromProvider('bamboo');
        const users = await this.usersService.getAllUser();
        const timeOffsToAdd = _.differenceWith(bambooTimeOffs, dbTimeOffs, (bambooTimeOff, dbTimeOff) => {
            return bambooTimeOff.id === dbTimeOff.bambooId;
        });
        const timeOffsToDelete = _.differenceWith(dbTimeOffs, bambooTimeOffs, (dbTimeOff, bambooTimeOff) => {
            return bambooTimeOff.id === dbTimeOff.bambooId;
        });
        // Wait for all writes so the "Finished cron job" message is logged after the work is done.
        await Promise.all(timeOffsToAdd.map(async (timeOff) => {
            const user = users.find(x => x.bambooId === timeOff.employeeId);
            if (user)
                await this.timeOffsService.saveTimeOff(timeOff, user.userNm);
        }));
        await Promise.all(timeOffsToDelete.map((timeOff) => this.timeOffsService.removeTimeOff(timeOff)));
    }
}

Let’s explain this class section by section. First, the constructor is where the task gets scheduled. The schedule method, imported from node-cron, receives 3 parameters: the cron expression, read from the environment file; the callback with the job code; and some scheduler options (in our case, the only option we set is that the task won’t start immediately).

The startJob method is a simple one: since we specified that the job does not start as soon as it is scheduled, we need a way to start it programmatically.

The next method, executeCronJob, is where everything happens, at least from a high level. Here we initialize all the services used to retrieve or insert information, and we print some informational messages to the console, such as the time the task started and when the job will run next.

The next two methods are similar but work on different entities, so one explanation covers both. The first step is to retrieve all the information needed by calling methods on the services instantiated in executeCronJob. Then, we compare the data using lodash’s differenceWith method (lodash being another famous package). Finally, from the resulting arrays, we either add or delete information in the database by calling the services again (updates are not handled in this example).
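Because the add and delete steps are asynchronous service calls, the way the loop is written decides whether the job actually waits for them. The sketch below illustrates the difference in plain TypeScript; fakeSave is a hypothetical stand-in for a service call such as saveUser, not the real service.

```typescript
// Hypothetical stand-in for an asynchronous database write.
function fakeSave(id: number, saved: number[]): Promise<void> {
  return Promise.resolve().then(() => {
    saved.push(id);
  });
}

// forEach ignores the promises returned by its callback, so this function
// reads saved.length before any write has completed.
async function saveWithForEach(ids: number[]): Promise<number> {
  const saved: number[] = [];
  ids.forEach(async (id) => {
    await fakeSave(id, saved);
  });
  return saved.length;
}

// Promise.all resolves only after every write has completed.
async function saveWithPromiseAll(ids: number[]): Promise<number> {
  const saved: number[] = [];
  await Promise.all(ids.map((id) => fakeSave(id, saved)));
  return saved.length;
}
```

In a cron job this matters for the final log line: a "finished" message printed after a forEach with async callbacks can appear while writes are still in flight, and any error thrown by those writes is not caught by the surrounding code.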

A big design improvement

As I’m writing this post, I’m noticing that the methods processEmployees and processTimeOff are, in essence, the same thing. So they can be abstracted to another method that encompasses the implementations. Feel free to design it differently.
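One possible shape for that abstraction is sketched below. It is only an illustration under my own assumptions: syncEntities and its parameter names are hypothetical, and the comparison is written in plain TypeScript instead of lodash so the block is self-contained.

```typescript
// Hypothetical generic helper: works out what to add and what to remove between
// a source list and a target list, then applies both sets of changes.
async function syncEntities<S, T>(
  source: S[],
  target: T[],
  isSame: (sourceItem: S, targetItem: T) => boolean,
  add: (sourceItem: S) => Promise<unknown>,
  remove: (targetItem: T) => Promise<unknown>
): Promise<{ added: S[]; removed: T[] }> {
  // Items present in the source but missing from the target must be added.
  const added = source.filter((s) => !target.some((t) => isSame(s, t)));
  // Items present in the target but missing from the source must be removed.
  const removed = target.filter((t) => !source.some((s) => isSame(s, t)));
  // Wait for every write before reporting what changed.
  await Promise.all(added.map((s) => add(s)));
  await Promise.all(removed.map((t) => remove(t)));
  return { added, removed };
}
```

With a helper like this, processEmployees would reduce to a single call that passes the employees, the users, the comparison employee.id === user.bambooId, and the saveUser and removeUser service methods; processTimeOff would do the same with its own data and services.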

The bamboo service

Now, we are going to work with the service that retrieves information from BambooHR.

import fetch from 'node-fetch';
import moment from 'moment';
import { IHumanResourceManagerService } from '../IHumanResourceManagerService';
import { Employee } from './employee';
import { VacationTimeOff } from './vacationTimeOff';

export class BambooService implements IHumanResourceManagerService {
    private bambooHeaders = {
        method: 'GET',
        headers: { 'Accept': 'application/json' }
    };

    private getBaseUrl(endpoint) {
        const key = process.env.bambooKey;
        const baseEndpoint = ':x@api.bamboohr.com/api/gateway.php';
        const subdomain = process.env.bambooSubDomain;
        return `https://${key}${baseEndpoint}/${subdomain}/v1/${endpoint}`;
    }

    public async getEmployees(): Promise<Employee[]> {
        const url: string = this.getBaseUrl('employees/directory');
        try {
            const response = await fetch(url, this.bambooHeaders);
            const directory = await response.json();
            return directory.employees
                .filter(employee => employee.workEmail)
                .map((employee) => {
                return {
                    ...employee,
                    id: parseInt(employee.id),
                }
            });
        } catch (error) {
            throw error;
        }
    }

    public async getTimeOffs(): Promise<VacationTimeOff[]> {
        const today = moment();
        const startDate = today.format('YYYY-MM-DD');
        const endDate = today.add(3, 'M').startOf('month').format('YYYY-MM-DD');
        // Backticks are required here; with single quotes ${startDate} and ${endDate} are sent literally.
        const url: string = this.getBaseUrl(`time_off/requests/?status=approved&start=${startDate}&end=${endDate}`);
        try {
            const response = await fetch(url, this.bambooHeaders);
            const timesOff = await response.json();
            return timesOff.map((timeOff) => {
                return {
                    ...timeOff,
                    id: parseInt(timeOff.id),
                    employeeId: parseInt(timeOff.employeeId),
                };
            });
        } catch (error) {
            throw error;
        }
    }
}

Again, let’s review this section by section. First, we create some reusable request options and a getBaseUrl method. This method builds the URL used to connect to BambooHR from configuration values read from an environment file.

Then, we have two methods that get the information: one for the employees and another for the time-off requests. Some logic is applied here to limit the information retrieved. For example, for time off we only ask for approved requests in a window that runs from today to the first day of the month three months ahead; anything earlier is not needed by our target system.
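The date window can be checked in isolation. Here is a minimal sketch of the same calculation without moment; timeOffWindow and timeOffEndpoint are hypothetical names, and the sketch works in UTC, whereas moment() uses local time, so results can differ near midnight.

```typescript
// Formats a date as YYYY-MM-DD using its UTC calendar fields.
function toIsoDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Hypothetical helper: the window runs from `now` to the first day of the
// month three months ahead (the same shape as add(3, 'M').startOf('month')).
function timeOffWindow(now: Date): { start: string; end: string } {
  // Date.UTC normalizes month overflow, so month 13 becomes February of the next year.
  const end = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 3, 1));
  return { start: toIsoDate(now), end: toIsoDate(end) };
}

// Builds the endpoint path and query string for approved requests in the window.
function timeOffEndpoint(now: Date): string {
  const { start, end } = timeOffWindow(now);
  return `time_off/requests/?status=approved&start=${start}&end=${end}`;
}
```

For a run on 15 January 2019, the window is 2019-01-15 to 2019-04-01. Note that the endpoint string is built with a template literal, which is what makes the two dates appear in the query.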

The database services

From the BambooCron class, we also use services that connect to our database. In our system, we are using TypeORM (which I wrote about in a previous post), an ORM with MySQL integration that supports TypeScript out of the box. For this post, I’m only going to show the service that manages users; the others follow a similar approach, so you can extrapolate to the rest of the entities.

import { BaseService } from "./base.service";
import { Employee } from "../../data-access/bamboo/employee";
import { User } from "../../data-access/entity/user";


export class UserService extends BaseService{
  public getAllUser = async () =>{
    return this.dbContext.users.find({
      where: { statusTxt: 'active' }
    });
  }

  public async saveUser(employee: Employee): Promise<User> {
    let newUser = this.createUser(employee);
    try {
      await this.dbContext.users.insert(newUser);
      return newUser;
    } catch (error) {
      throw error;
    }
  }

  public async removeUser(user: User) {
    try {
      user.statusTxt = <any>{ statusTxt: 'inactive' };
      await this.dbContext.users.save(user);
    } catch (error) {
      throw error;
    }
  }

  private createUser(employee: Employee): User {
    const userNm = employee.workEmail.substring(0, employee.workEmail.indexOf('@'));
    const user: User = this.dbContext.users.create({
      bambooId: employee.id,
      email: employee.workEmail,
      fullNm: employee.displayName,
      userNm: userNm,
      statusTxt: <any>{ statusTxt: 'active' }
    });
    return user;
  }
}

The UserService is pretty straightforward. It has a few CRUD operations: getting active users, saving new users and removing them (a soft delete that changes the status). It extends a BaseService class, which looks like this:

import { DbContext } from "../../data-access/dbcontext";

export class BaseService{
  protected dbContext:DbContext = new DbContext();
}

This one is even simpler, since it only exposes a property called dbContext. Every service that inherits from BaseService can use this property, which gives it access to TypeORM connections for running queries or transactions against the database. Finally, this is what the DbContext class looks like:

import { Connection, createConnection, EntityManager, Repository } from "typeorm";
import { User } from "./entity/user";

export class DbContext {
    private connection: Connection;
    constructor (){
        this.init();
    }

    private async init(){
        try {
            this.connection = await createConnection({
                "name": `connection-${new Date().getTime()}`,
                "type": "mysql",
                "host": ANY_HOST_HERE,
                "port": 3306,
                "username": ANY_USERNAME_HERE,
                "password": ANY_PASSWORD_HERE,
                "database": ANY_DATABASE_HERE,
                "synchronize": false,
                "logging": true,
                "entities": [
                    User
                ]
            });
        } catch (error) {
            throw error;
        }
    }
    
    public get manager() : EntityManager {
        return this.connection.manager;
    }

    public get users(): Repository<User>{
        return this.manager.getRepository(User);
    }
}

The DbContext class is a reduced version of the one I use; mine has more entities, but the rest of the design is the same. First, we have an init method that creates a connection every time a DbContext is instantiated, passing in the entities and the database information needed to create it. Note that init is asynchronous and the constructor does not wait for it, so the connection only becomes available once createConnection resolves.

Then, for every entity, we expose a getter that returns the repository TypeORM uses for that entity.

Finally, where do we execute all of this code? Since the job needs to start as soon as the Node service starts, we add the code to the index.ts file of the Express application, like this:
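Since the connection is created asynchronously, one way to make sure it is never used before it is ready, and is created only once, is to share a single promise. The following is a minimal sketch of that idea in plain TypeScript; LazyResource is a hypothetical wrapper and is not part of TypeORM.

```typescript
// Hypothetical wrapper: runs an async factory at most once and shares the result.
class LazyResource<T> {
  private pending: Promise<T> | undefined;
  private readonly factory: () => Promise<T>;

  constructor(factory: () => Promise<T>) {
    this.factory = factory;
  }

  // Every caller awaits the same promise, so the resource is created only once
  // and is never handed out before it is ready. A failed attempt stays cached
  // in this sketch; a fuller version might reset `pending` on rejection.
  get(): Promise<T> {
    if (this.pending === undefined) {
      this.pending = this.factory();
    }
    return this.pending;
  }
}
```

In the DbContext above, the factory would be the createConnection call, and the getters would await get() before asking for a repository. That changes the getters into asynchronous methods, so treat it as a design option to weigh, not a drop-in replacement.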

// ...imports and other stuff here

const cron = new BambooCron();
// ...some logic here to prepare the service or other things
cron.startJob();

const port = parseInt(process.env.PORT);
export default new Server()
  .router(routes)
  .listen(port);

Summary

Finally, we have arrived at the end, and if you are still here, it means that you have seen all the code needed to run a scheduled task using node-cron and TypeORM. This is just one of the many use cases this design can cover, so adapt it as you see fit to whatever case you have to solve.

If you have any comments, don't hesitate to contact me or leave a comment below. And remember to follow me on Twitter to get updates on every new post.

Node.js and ORMs? TypeORM at your service

As I’ve mentioned in other posts, I’m working on more Node.js projects than ever and my time with .NET applications is shrinking (I’m not complaining, in case you were wondering). However, most of those projects have used NoSQL databases like DynamoDB, and I haven’t needed an ORM for them, even though there are options.

Recently, I was assigned a project that needed a big rescue. It’s an internal company application that has been rewritten many times without ever being completed, and the design itself wasn’t as good as you would like it to be. My assignment: design the architecture, define the technologies to be used, estimate the work and assign tasks to a group of developers. So far, so good.

I decided to use React in the front end and one main REST service in ExpressJS, both written in TypeScript. But what about the database? Well, MySQL was my choice. If you ask me why, it’s because the business logic makes more sense in a relational environment, but also because I want my team (and myself) to use an ORM to connect to a relational database.

A quick search in Google will give you many results about ORMs, but I found an interesting article that compares and ranks many of them; take a look at it here. The only comment I would make is that mongoose is limited to a single database, whereas TypeORM supports multiple ones, so in my mind they should switch positions.

OK, enough chit-chat, let’s start with the tech-y part. First, a simple definition of an ORM.

What’s an ORM?

ORM stands for Object-Relational Mapping, a mechanism that lets developers work with data from code without the hassle it would normally take. An ORM maps the data source to objects that can be queried, and it translates the actions performed on those objects into the commands of the specific data source. In other words, ORMs abstract the data access layer away from developers and serve a “virtual object database” that can be used from within the programming language.

There are many ORM tools in the community; here are some of the most famous ones per programming language:
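As a toy illustration of that translation step, the sketch below turns an object-style filter into a parameterized SELECT statement. It is not how TypeORM or any real ORM is implemented; buildSelect and the Filter type are hypothetical names used only to show the idea of mapping an action on objects to a command for the data source.

```typescript
// Toy illustration only: maps an object-style filter to a parameterized SELECT.
type Filter = Record<string, string | number>;

function buildSelect(table: string, where: Filter): { sql: string; params: Array<string | number> } {
  const columns = Object.keys(where);
  // Each filter key becomes a "column = ?" condition; the values travel separately as parameters.
  const conditions = columns.map((column) => `${column} = ?`);
  const sql = conditions.length > 0
    ? `SELECT * FROM ${table} WHERE ${conditions.join(' AND ')}`
    : `SELECT * FROM ${table}`;
  return { sql, params: columns.map((column) => where[column]) };
}
```

Calling buildSelect('User', { statusTxt: 'active' }) produces a statement with one placeholder and the value 'active' as its parameter. A real ORM adds much more on top, such as mapping result rows back to objects, relations and dialect differences.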

  1. Hibernate -> Java

  2. Entity Framework -> C#

  3. CakePHP (the ORM built into the framework) -> PHP

  4. Django (the ORM built into the framework) -> Python

  5. ActiveRecord -> Ruby

What’s TypeORM?

As the name suggests, and as mentioned many times in this post, it is an ORM that runs in NodeJS; however, it also supports other environments like PhoneGap, React Native, NativeScript, etc.

It’s built to be used with TypeScript or recent versions of JavaScript (ES5 to ES8). The current version number is 0.2.9, but don’t be fooled by that: it has over 3,000 commits on GitHub, more than 40 thousand downloads per week and, last but not least, over 9,000 stars.

The first version was released on December 6, 2016, and there have been 36 releases since then. Being a young tool, it has been influenced by others like Hibernate and Entity Framework, so if you notice things that feel familiar, it’s because they are.

From their website, here are some of the main features it provides:

  1. Both DataMapper and ActiveRecord

  2. Eager and Lazy relations

  3. Multiple inheritance patterns

  4. Transactions

  5. Cross-database queries

  6. Query caching

  7. Support for 8 different databases

  8. And many more here

Is there a model generator?

If you have worked with Entity Framework, you know that you can reverse engineer an existing database to create all the POCO classes. Well, for TypeORM there is something similar.

Kononnable’s typeorm-model-generator package solves all of this for you. It can create all the entity classes that you need in your application, and it supports 6 databases, leaving MongoDB and sql.js out of the equation.

This package is even younger than TypeORM: its first release was in July 2017, and it has had 24 releases since then. It’s far less known in the community, with around 300 downloads per week according to npmjs.

Still, this package works like a charm and the configuration is as simple as you can imagine. Take a look at the next line:

typeorm-model-generator -e mysql -h [HOSTNAME] -d [DATABASE] -u [USER] -x [PASSWORD] -p 3306 --noConfig -o . --cf camel --ce pascal --cp camel --lazy

With it you can specify all the parameters needed to connect to a database, as well as settings for the generated classes, like the naming conventions and the “laziness” of the relationships between them. If you want to see all the available configuration options, click here to view them.

Using TypeORM

After creating the classes with the typeorm-model-generator package, I ended up having classes that look something like this one:

import { Index, Entity, Column, OneToOne, OneToMany, ManyToOne, JoinColumn } from "typeorm";
import { UserStatus } from "./userStatus";
import { ProjectMember } from "./projectMember";
import { StatusReport } from "./statusReport";
import { TimeOff } from "./timeOff";

@Entity("User", { schema: "statusone" })
@Index("StatusTxt", ["statusTxt",])
export class User {
    @Column("varchar", {
        nullable: false,
        primary: true,
        length: 50,
        name: "UserNm"
    })
    userNm: string;

    @Column("varchar", {
        nullable: false,
        length: 150,
        name: "Email"
    })
    email: string;

    @Column("varchar", {
        nullable: false,
        length: 150,
        name: "FullNm"
    })
    fullNm: string;

    @ManyToOne(() => UserStatus, UserStatus => UserStatus.users, { nullable: false, onDelete: 'RESTRICT', onUpdate: 'RESTRICT' })
    @JoinColumn({ name: 'StatusTxt' })
    userStatus: Promise<UserStatus | null>;

    @OneToOne(() => ProjectMember, ProjectMember => ProjectMember.userNm, { onDelete: 'RESTRICT', onUpdate: 'RESTRICT' })
    projectMember: Promise<ProjectMember | null>;

    @OneToMany(() => StatusReport, StatusReport => StatusReport.userNm, { onDelete: 'RESTRICT', onUpdate: 'RESTRICT' })
    statusReports: Promise<StatusReport[]>;

    @OneToMany(() => TimeOff, TimeOff => TimeOff.userNm, { onDelete: 'RESTRICT', onUpdate: 'RESTRICT' })
    timeOffs: Promise<TimeOff[]>;
}

This is a simple example of the User table in our database with its relationships, like the status of the user, all the reports and more.

TypeORM relies heavily on decorators, which requires some options to be enabled in the tsconfig file (experimentalDecorators and emitDecoratorMetadata). Don’t worry about them, since they are explained in the installation instructions here. But if you want the easy route, here is mine.

{
  "compileOnSave": false,
  "compilerOptions": {
    "target": "es6",
    "module": "commonjs",
    "esModuleInterop": true,
    "sourceMap": true,
    "moduleResolution": "node",
    "outDir": "dist",
    "typeRoots": ["node_modules/@types"],
    "experimentalDecorators": true,
    "emitDecoratorMetadata": true
  },
  "include": ["typings.d.ts", "server/**/*.ts", "src/**/*.ts"],
  "exclude": ["node_modules"]
}

OK, so our project has the classes from the database, and the TypeScript compiler recognizes all the decorators that TypeORM uses. So how do I use it?

I’m not going to expose the architecture I’m planning on using in the application, mostly because I haven’t completed it. But here is an example of a basic query I’ve done with TypeORM.

import {createConnection} from "typeorm";
import {User} from "../data-access/entity/User";

createConnection().then(async connection => {
      try {
        const users = await connection.manager.find(User);
        console.log("Loaded users: ", users); 
      } catch (error) {
        console.log(error);
      }
    }).catch(error => console.log(error));

It’s as easy as it looks: you can obtain all the users from the database by creating a connection and then using the connection manager to find all the objects of the class passed as a parameter.

A more complex query could be the following:

import {createConnection} from "typeorm";
import {User} from "../data-access/entity/User";

createConnection().then(async connection => {
      try {
        let projectMembers = await connection
            .getRepository(User)
            .createQueryBuilder("user")
            .innerJoinAndSelect("user.projectMember", "projectMember")
            .getMany();
        console.log("Loaded members: ", projectMembers); 
      } catch (error) {
        console.log(error);
      }
    }).catch(error => console.log(error));

In this example, we build a query that loads the ProjectMember relationship for the users with the QueryBuilder syntax, which reminds me a lot of Entity Framework.

As far as query examples go, I will stop here and suggest that you read more documentation like here or here.

Is there anything bad?

Nothing is perfect in this world, and I didn’t have to dig much to find something that surprised me. Believe it or not, TypeORM apparently still doesn’t support bit types for MySQL (check line 81 in this file).

Even though this shocked me at first, I was able to change the design to avoid those bit types and use something different. Still, it’s something that annoys me, and I hope they add that support soon.

Summary

TypeORM is one more tool for our toolkit, one that will make your life a lot easier and has the support of a big community. It lets you abstract away a lot of the complex code that comes with connecting to a database, and it helps you focus on your business instead of figuring out how to do a select on a table.
