Scrape & Analyze Restaurant Reviews

This simple ETL workflow scrapes all of a restaurant's Tripadvisor reviews for the current month, then runs a sentiment analysis on each review to identify its positive and negative aspects. The results are then saved to a Google Sheet, where a report is generated.

Visual of Workflow

This flowchart helps us to visualize the tasks and logic of the workflow.

Workflow Steps

  • First, it gets the current month.
  • Then it scrapes all of the Tripadvisor reviews for that restaurant for that month.
  • Next, a sentiment analysis task is launched for each review (in parallel) to determine whether it is positive or negative.
  • Finally, the results are saved to a Google Sheet.
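
To make the steps above concrete, here is a minimal plain-JavaScript sketch of the first three, written without the Zenaton engine. Everything in it is an assumption made for illustration: the review shape (`{ text, date }`), the function names, the use of UTC, and the toy word lists, which stand in for a real sentiment model.

```javascript
// Hypothetical review shape: { text: string, date: "YYYY-MM-DD" }.

// Step 1: current month as "YYYY-MM" (using UTC is an assumption).
function getCurrentMonth(now = new Date()) {
  return now.toISOString().slice(0, 7);
}

// Step 2 (after scraping): keep only the reviews dated in the given month.
function reviewsOfTheMonth(reviews, month) {
  return reviews.filter((review) => review.date.startsWith(month));
}

// Step 3: toy word-list classifier, a stand-in for a real sentiment model.
const POSITIVE_WORDS = ["great", "delicious", "friendly", "excellent"];
const NEGATIVE_WORDS = ["slow", "cold", "rude", "bland"];

async function analyzeSentiment(review) {
  const words = review.text.toLowerCase().match(/[a-z']+/g) || [];
  const positive = words.filter((word) => POSITIVE_WORDS.includes(word));
  const negative = words.filter((word) => NEGATIVE_WORDS.includes(word));
  return { text: review.text, positive, negative };
}

// Steps 1 to 3 chained. Promise.all starts every analysis up front
// instead of waiting for the previous one to finish.
async function analyzeMonth(reviews, now = new Date()) {
  const month = getCurrentMonth(now);
  const selected = reviewsOfTheMonth(reviews, month);
  return Promise.all(selected.map(analyzeSentiment));
}
```

In the real workflow each of these functions would be a separate task run by the workflow engine, so a failed analysis can be retried on its own.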

Workflow Code

This workflow code orchestrates the tasks through the Zenaton workflow engine and executes them on your servers. The implementation of the individual tasks is not detailed here.

const { workflow } = require("zenaton");

// Placeholder: ID of the Google Sheet that receives the results.
const spreadsheetId = "your-spreadsheet-id";

module.exports = workflow("ReviewAnalysisWorkflow", {
    *handle(restaurantName, tripAdvisorUrl) {
        const google_sheets = this.connector('google_sheets', 'your-connector-id');

        const currentMonth = yield this.run.task("GetCurrentMonth");

        const reviews = yield this.run.task("ScrapeReviewsOfTheMonth",
            tripAdvisorUrl,
            currentMonth
        );

        // One [taskName, input] pair per review.
        const tasks = reviews.map((review) => ["SentimentAnalysis", review]);
        const positiveAndNegativeAspects = yield this.run.task(tasks); // run in parallel

        const cells = this.format_cells(restaurantName, currentMonth, positiveAndNegativeAspects);
        yield google_sheets.post(`v4/spreadsheets/${spreadsheetId}:batchUpdate`, cells);
    },
    format_cells(restaurantName, currentMonth, positiveAndNegativeAspects) {
        // ...
    }
});
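
The body of `format_cells` is left out above. As one possible shape, the sketch below builds a Google Sheets `batchUpdate` request body that appends one row per review through an `appendCells` request. The function name `formatCells`, the column layout, the `sheetId` default of 0, and the `{ text, positive, negative }` shape of each analysis result are all assumptions made for illustration, not the original implementation.

```javascript
// Hypothetical input: one { text, positive, negative } object per review.
function formatCells(restaurantName, currentMonth, aspects, sheetId = 0) {
  // Wrap a plain value in the CellData structure used by the Sheets API.
  const cell = (value) =>
    typeof value === "number"
      ? { userEnteredValue: { numberValue: value } }
      : { userEnteredValue: { stringValue: String(value) } };

  // Assumed column layout: restaurant, month, review text, positives, negatives.
  const rows = aspects.map((aspect) => ({
    values: [
      cell(restaurantName),
      cell(currentMonth),
      cell(aspect.text),
      cell(aspect.positive.join(", ")),
      cell(aspect.negative.join(", ")),
    ],
  }));

  // A batchUpdate body is a list of requests; appendCells adds rows
  // after the last row that already contains data.
  return {
    requests: [{ appendCells: { sheetId, rows, fields: "userEnteredValue" } }],
  };
}
```

Appending rows rather than overwriting a fixed range lets each monthly run add to the same sheet, which suits a recurring report.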

Schedule Scrape & Analyze Workflow

We can make this workflow recurrent with the schedule method. For instance, the single call below schedules the workflow to run at midnight on the first day of every month.

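const { Client } = require("zenaton");
const client = new Client(app_id, api_token, app_env);

// Cron expression: at 00:00 on the 1st day of every month.
// restaurantName and tripAdvisorUrl are the inputs of the workflow's handle method.
client.schedule("0 0 1 * *").workflow("ReviewAnalysisWorkflow", restaurantName, tripAdvisorUrl);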
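
The five fields of a cron expression are minute, hour, day of month, month, and day of week, so `0 0 1 * *` reads as minute 0, hour 0, day 1, any month, any weekday. As a quick sanity check, the sketch below computes when that expression next fires. The helper name `nextMonthlyRun` is hypothetical, and evaluating the schedule in UTC is an assumption; the scheduler's actual time zone may differ.

```javascript
// Next firing time of "0 0 1 * *": 00:00 on the 1st of the following month.
// Evaluated in UTC, which is an assumption.
function nextMonthlyRun(now = new Date()) {
  // Date.UTC rolls a month index of 12 over to January of the next year.
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
}

console.log(nextMonthlyRun(new Date("2020-12-15T10:00:00Z")).toISOString());
```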

All scheduled workflows are listed on the Zenaton dashboard:

cron-review-analysis

Workflow Executions

View a short snippet of the task executions from the dashboard.