For the past couple of years, we have been using require.js for module loading and Grunt for automating front-end tasks in one of the many projects we have at Wingify. The project has a huge codebase made up of many independent components that share some utilities. It also had no concrete build system that could scale as new components were added.
We used require.js to structure the code and to manage and load modules. Each module had its own require-config.js file defining the rules for that module.
We used Grunt to automate the mundane work. Our tasks included the require-amdclean task, concatenation of script and CSS files, minification, cache busting, and so on.
Following are some benefits we were getting from the require-amdclean task:
We didn’t have to include require.js in production, which saved some bytes.
We got rid of file-size and source-code readability concerns.
Everything worked as expected, but maintenance, performance, and scale were issues. We had many healthy discussions about improving things, and that led us to consider upgrading our tech stack too. Since, as mentioned, we didn’t have a concrete build system, it was the right time to investigate further. We were ready to spend some quality time researching technologies that could fit into our build system. Gaurav Nanda and I took a break from our daily chores and read many articles, blogs, and the not-so-useful official docs to get a good command of the various technologies. Migrating from Grunt to Gulp didn’t help, since the build time stayed nearly the same. The task that took the most time was require-amdclean: in the development environment it took around 10 seconds even after adding a single character such as ;.
Migrating from NPM to Yarn - First step towards a new journey
After reading about Yarn, the team was really curious to play with this new package manager (aka dependency manager). When we benchmarked it, we were stunned by the difference between NPM and Yarn in the time taken to fetch packages. Yarn gets its speed from parallelizing operations, and its reliability and security from maintaining a yarn.lock file.
For a total of 34 packages, the following stats would please your eyes too :)
Running the commands with already installed packages
yarn(with yarn.lock file)
Yarn offers a lot more besides speed, security, and reliability. Check out the commands Yarn offers.
Since we were using bower too, our first step was to port all the dependencies and dev-dependencies listed in our bower.json file to package.json. This was a time-consuming task since we had a huge list of packages. After successful porting of packages and validating the version numbers with the previous packages, we were all set to switch to Yarn. This also helped in keeping just one file for managing packages. We are no longer using bower. Even bower’s official site recommends using Yarn and Webpack :)
Why switch to Webpack 2
It wasn’t an easy task to accomplish since Webpack is a module bundler rather than a task runner. We were so accustomed to using task runners along with the old-fashioned require.js based module management that it took a good amount of time figuring out how to proceed with our mini-app’s new build system.
Apart from the numerous benefits of using Webpack, the most notable features, especially for our codebase and the build system, were:
Easy integration with npm/yarn and seamless handling of multiple module formats. We now use two of them: one is UMD and the other is the "this" target option (we have such a requirement).
Single main entry and one single bundled output - exactly what we needed.
Cache busting (hashing) - very easy to implement and benefit from.
Building different, independent, and standalone modules simultaneously. Thanks to parallel-webpack!
Using webpack-loaders -
babel-loader - so that we could start writing ES6 compatible code even with our require.js module management system.
Converting to Webpack 2 - A transcendent journey ahead
In the beginning, it looked like we just had to port the require.js configuration to Webpack and we’d be done. A big NO! That thought was absolutely wrong. There were many scenarios we had to deal with, which we will discuss in detail as we move along.
First things first: a clear understanding of what exactly Webpack is and how it bundles modules is a must. Simply copy-pasting the configuration file from the official website and tweaking it won’t help in the long run. One must be very clear about the fundamentals on which Webpack is built.
Problems which we needed to tackle were:
Different modules in the same app, having different configuration files.
Webpack config should be modular in itself and be able to run multiple configs at once, so that we can add or remove a module easily without affecting any existing one.
Since we needed to support different modules, we had to have a separate config file for each of our modules.
The above configuration is capable of handling any number of modules. Each module will have at least one bundled JS file as its output. But we also needed a bundled CSS file for each module. So, for every module that has both JS and CSS bundling, we decided to have two config files: one for bundling JS and the other for managing assets and bundling CSS files. Tasks like copying files from src to dist, updating the JS file name with a cache-busting hash (prod build) in the index.html file, and so on were taken care of inside the assets config file.
The above-mentioned break-down of a module into JS and CSS bundling helped us in having a clean, modular, and scalable approach for our new build system.
We also used parallel-webpack to speed up our build by running independent modules in parallel. But be very careful when using it: it spawns a separate worker process for each task, so the work is spread across the different cores of the machine. There should therefore be a cap on the number of parallel tasks to keep CPU usage from overshooting.
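One way to choose such a cap is to derive it from the number of CPU cores. The sketch below is only an illustration under stated assumptions, not necessarily the setup used in this project: pickWorkerCap is a hypothetical helper, and the core count is passed in explicitly (in Node it would typically come from os.cpus().length).

```javascript
// Hypothetical helper: decide how many builds may run at the same time.
// taskCount - number of independent module builds
// coreCount - number of CPU cores available on the machine
function pickWorkerCap(taskCount, coreCount) {
  // Leave one core free for the OS and the parent process.
  const usableCores = Math.max(1, coreCount - 1);
  // Never start more workers than there are tasks, and never fewer than one.
  return Math.max(1, Math.min(taskCount, usableCores));
}

console.log(pickWorkerCap(6, 4)); // 3
```

Leaving one core free is a design choice rather than a rule; on a dedicated build machine it may be reasonable to use every core.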
Extraction of common stuff for reusability and maintainability
Let’s discuss Webpack module-rules and resolve-aliases which play a significant role, before advancing further with the creation of common webpack-configuration helper methods.
resolve aliases - create aliases to import or require certain modules more easily.
module rules - tell Webpack how to read a module and how to use it.
We used expose-loader and imports-loader depending on the use-case.
expose-loader - adds modules to the global object. This is useful for debugging or supporting libraries that depend on libraries in globals.
imports-loader - is useful for third-party modules that rely on global variables like $ or this being the window object. The imports loader can add the necessary require('whatever') calls, so those modules work with Webpack.
Obviously, we had the same third-party libraries, wrappers over external libraries, and home-grown utilities shared across different modules. This means each module-specific webpack config file would repeat the same set of rules and aliases. Code duplication might seem a good fit here for readability, but it is really painful to maintain in the long run.
Let’s discuss how we managed to share the common module rules and resolve aliases across the different modules.
Below is a generic utility file’s code which has two methods. One outputs whether a passed argument is an Object and the other one outputs whether it’s an array.
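A minimal sketch of what such a utility could look like is shown here. The function names isObject and isArray are assumptions chosen to match the description, and the implementation is one common way of doing these checks, not necessarily the original code.

```javascript
// Returns true only for plain objects, not for arrays, null, or functions.
function isObject(value) {
  return Object.prototype.toString.call(value) === '[object Object]';
}

// Returns true when the passed argument is an array.
function isArray(value) {
  return Array.isArray(value);
}

console.log(isObject({ a: 1 }), isObject([1]), isArray([1]), isArray('x'));
// true false true false
```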
Here’s a list of common rules and aliases defined explicitly in a separate file.
We now had a common file where we could easily add, update, or remove any rule and its corresponding alias. Next, we needed a utility that combines the common rules and aliases with the rules and aliases already defined in a particular module’s config file.
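The idea of such a merge utility can be sketched as follows. This is a simplified illustration under assumptions: the rule and alias values are placeholders, and the exact signature of mergeRulesAndUpdate in the real project may differ. Here the module’s own aliases override the common ones when names clash.

```javascript
// Hypothetical shared definitions; the alias path is a placeholder.
const commonRules = [{ test: /\.js$/, loader: 'babel-loader' }];
const commonAliases = { sharedUtils: 'src/shared/utils' };

// Returns a new config whose rules and aliases include the common ones.
// The config passed in is left untouched.
function mergeRulesAndUpdate(config) {
  const moduleRules = (config.module && config.module.rules) || [];
  const moduleAliases = (config.resolve && config.resolve.alias) || {};
  return Object.assign({}, config, {
    module: Object.assign({}, config.module, {
      rules: commonRules.concat(moduleRules)
    }),
    resolve: Object.assign({}, config.resolve, {
      // Module-specific aliases win over common aliases on a name clash.
      alias: Object.assign({}, commonAliases, moduleAliases)
    })
  });
}

const moduleAConfig = mergeRulesAndUpdate({
  entry: './src/moduleA/main.js',
  resolve: { alias: { moduleAHelper: 'src/moduleA/helper' } }
});
console.log(Object.keys(moduleAConfig.resolve.alias));
```

Returning a new object instead of mutating the input keeps each module’s config independent, which matters when several configs are built in the same run.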
Time to write our module-specific config file. We’ll demonstrate just one config file, the one for moduleA; the others look exactly the same apart from the option values specific to each module.
Here’s the full webpack config file for moduleA.
This is a complete webpack config file for bundling the JS of moduleA. While configuring it, we defined different options, each with its own purpose. To know more about each option, please refer to the official Webpack documentation.
We introduced two loaders for bundling JS resources inside our app.
Since we needed these two loaders for all our modules, we defined them in the same file we discussed earlier - rulesAndAliasUtil.js
We then updated the mergeRulesAndUpdate method as follows:
This was all about bundling of JS modules. The same approach was followed for different modules. Now we were left with the bundling of our CSS files and the obvious chores like copying, replacing, etc.
Webpack Bundling of CSS files
The above configuration outputs two bundled CSS files, i.e. css-file-1.min.css & css-file-2.min.css, or css-file-1-8fb1ed.min.css & css-file-2-6ed3c1.min.css if it’s a prod build.
We faced a weird issue that is worth mentioning explicitly. While extracting CSS with ExtractTextPlugin, URLs such as those in background-image: url() get processed. We needed to stop that behavior, which is done by setting url: false in the loader options (it is a css-loader option), like:
A few more plugins that we are using are:
CleanWebpackPlugin - to remove/clean the styles folder inside the build folder before building
ManifestPlugin - for generating an asset manifest file with a mapping of all source file names to their corresponding output file
This plugin generates a JSON file so that the hash appended to a file name (prod build) can be read later by another file. E.g. one CSS file is shared among different modules, so its hash needs to be stored somewhere to be read later by the other modules, which update the hash in their corresponding index.html files.
CopyWebpackPlugin - to copy individual files or entire directories to the build directory
PurifyCSSPlugin - to remove unused selectors from the CSS. This plugin was a must for us. Earlier, what we did throughout this project was copy-paste the parent project’s CSS file into this independent project. We followed the same approach because of time constraints, but then found this amazing plugin, which automatically removes unused CSS from the bundled CSS files based on the paths of the files that use it. We can even whitelist selectors if classes are appended at run time or for any other reason. It is highly recommended to use the PurifyCSS plugin together with the Extract Text plugin discussed above.
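To illustrate how a generated manifest can be used to update file references, here is a minimal sketch. It assumes the manifest is a flat JSON object mapping original file names to hashed ones; applyManifest and the sample HTML are hypothetical and only show the idea.

```javascript
// Hypothetical manifest contents: original file name -> hashed file name.
const manifest = {
  'css-file-1.min.css': 'css-file-1-8fb1ed.min.css',
  'css-file-2.min.css': 'css-file-2-6ed3c1.min.css'
};

// Replaces every occurrence of an original file name with its hashed name.
function applyManifest(html, assetManifest) {
  return Object.keys(assetManifest).reduce(
    (output, original) => output.split(original).join(assetManifest[original]),
    html
  );
}

const html = '<link rel="stylesheet" href="styles/css-file-1.min.css">';
console.log(applyManifest(html, manifest));
// <link rel="stylesheet" href="styles/css-file-1-8fb1ed.min.css">
```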
Last step - Automated scripts and provision to execute module-specific build
First, we created a file that reads the arguments passed via a package.json script, so that they can be used in our webpack.config.js file.
We tweaked our main webpack.config.js to make it module-aware.
In our package.json file, we created different scripts to run either a development build or a production-ready build (minification, cache busting, and purification), and to run the build either for all modules or only for selected modules.
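The module-aware behaviour described above can be sketched like this. The flag names --modules= and --prod, as well as the helper names, are assumptions made for illustration; the real scripts may use different arguments.

```javascript
// Parses arguments such as ['--modules=moduleA,moduleB', '--prod'].
// In a real build these would come from process.argv.slice(2).
function parseBuildArgs(argv) {
  const args = { modules: [], prod: false };
  argv.forEach((arg) => {
    if (arg === '--prod') {
      args.prod = true;
    } else if (arg.indexOf('--modules=') === 0) {
      args.modules = arg.slice('--modules='.length).split(',').filter(Boolean);
    }
  });
  return args;
}

// Picks the configs to build: the requested modules, or all of them.
function selectConfigs(allConfigs, args) {
  const names = args.modules.length ? args.modules : Object.keys(allConfigs);
  return names.map((name) => {
    if (!Object.prototype.hasOwnProperty.call(allConfigs, name)) {
      throw new Error('Unknown module: ' + name);
    }
    return allConfigs[name];
  });
}

// Placeholder configs keyed by module name.
const allConfigs = {
  moduleA: { name: 'moduleA' },
  moduleB: { name: 'moduleB' }
};
const args = parseBuildArgs(['--modules=moduleA', '--prod']);
console.log(args.prod, selectConfigs(allConfigs, args).map((c) => c.name));
```

Failing loudly on an unknown module name is a deliberate choice here, so that a typo in a script does not silently build nothing.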
According to Sean T. Larkin’s release blog post, “webpack 3: Official Release!!”, migrating from webpack 2 to 3 should involve no effort beyond running the upgrade commands in your terminal. We have since upgraded to webpack 3 ourselves :)
Last but not least - Stepping towards a long journey
This was just the beginning of stepping towards researching different technologies and upgrading our tech stack. We have now gradually started writing ES6 code for that particular project. The experience was tremendous and the team is now working on evaluating other sections where the change could gradually take a form.
“What is the most resilient parasite? Bacteria? A virus? An intestinal worm? An idea. Resilient… highly contagious. Once an idea has taken hold of the brain it’s almost impossible to eradicate. An idea that is fully formed - fully understood - that sticks; right in there somewhere.” – Cobb(Leonardo DiCaprio), Inception
What is DevFest?
On September 9th we had the first instance of our Wingify DevFest. It started with a simple idea, to have a community of fellow techies where everyone could meet, learn something new, share ideas and inspire one another. But we didn’t just want to end here. We wanted to have a day where people could celebrate and have a good time. Thus, the Wingify DevFest was born.
How did we plan for it?
Though the DevFest happened on 9th September, the preparations had started much earlier. In fact, the whole structure of DevFest went through drastic iterations after we first started working on it. Initially, we had simply planned on having a set of internal Wingify team members as speakers. The rationale was that, this being our first DevFest, having internal speakers would give us a good grasp of the speakers and their content. It would also be easier to organize because we could skip the overhead of finding external speakers. This idea was soon scrapped because it would have compromised the interest of our teammates, as most of them had already watched the internal talks. The other extreme was to have only external speakers, which was also soon ruled out because of the logistics involved. We also knew that some of our own internal speakers had good content which the world should definitely see. Finally, we agreed upon an all-external speaker list with the internal speakers as backup, should the need arise. And thank God we did, because as you’ll soon find out, we did have to use the fallback.
Amidst the initial confusion of finding the ideal number and type of speakers, there was still an extreme clarity within the organizing team about the other events that we wanted to have. More on that later.
Deciding on the theme
Organizing the first of a series always has its own set of challenges and uncertainties. For us, the main challenge, which was a crucial factor in almost all of our decisions, from the topic of the DevFest to what swag we should have, was identifying our target audience. Unlike some major tech cities such as Bangalore and Hyderabad, where the majority of folks are working professionals, Delhi has a beautiful, eclectic mixture of working professionals and college students. In fact, the number of engineering colleges in Delhi is mind-blowing. As a result, most meetups and communities have a mixture of both groups, so we concluded that we too could expect both. The challenge was to find a theme that would resonate with everyone. Performance, Reliability and Security was the perfect topic because everyone, at some point in their college or professional life, has needed to understand it more deeply. With a balanced set of talks on this theme, we could keep both groups interested.
With the topic of the DevFest clear, finding speakers was the next challenge, or so we thought. On 27th July we started campaigns on several social media channels, at meetups, and by word of mouth to find the best tech speakers in Delhi. It was a 15-day campaign and by the time it ended we were ecstatic. There were more than 20 entries and some people even tried to register after the deadline. Not bad for the first time 🙂. After several meetings and discussions, we narrowed the list down to 3 final speakers and even sent them their invitations. Too easy, we thought. One week before the event, 2 of our speakers backed out because of unavoidable issues. There was a DEFCON 1 emergency declared in our nation! Everyone went on a rampage. Well, maybe I’m exaggerating a bit, it wasn’t DEFCON 1 because we didn’t have nuclear weapons, but you get the drift. In that frenzy we sought out the internal speakers. Things could’ve gone really south if we hadn’t had a plan B. In the end, though, we had four speakers instead of three, because one of the speakers who had backed out managed to rejoin, and we were more than happy to readjust the schedule. These were the speakers who finally spoke:
At Wingify, we frequently have internal technical events that keep our wits sharp. Since one of the core ideas of the DevFest was to keep it interactive for everyone, what better way than to include a few of these events in the schedule as well? Selecting the events was as easy as looking back at the list of the previous year’s events and picking the ones the majority of team members had liked. The finalists were Code in the Dark and Capture the Flag.
The day before The Day, we stayed back late in the office. The previous week had been taxing because of the whole speakers-backing-out fiasco and also because the organising team had been really busy releasing the new VWO Conversion Optimization Platform to the general public! Thus, there were several logistics that had to be taken care of on the last day. Everyone left late that day, yet returned early the next morning.
September 9th, our spirits were high. No, we weren’t high (at least not until the events lasted), we were giddy. Everything was set. The initial slow pouring of the attendees soon gained pace and by 11 am our office was packed and ready for some action. It was a good mixture of energetic college folks and knowledgeable professionals, each trying to find like-minded counterparts to talk ideas. Thanks to Akash Tyagi, we had some really cool banners installed all over the place. In fact, right from the beginning he had been the guy who’d designed the banners, logos and social media cards etc., which everyone greatly admired.
Atul Agarwal had accepted our request to be the Keynote Speaker for the event. His talk on performance, reliability and security was full of wisdom that he had garnered on his journey to make AdPushup a successful and formidable ad-revenue optimization company in its space. He talked about how most companies, in their haste to launch feature after feature, often forget the aspects of performance, reliability and security, which later come back to bite them. Sometimes overlooking such aspects costs companies a fortune and, even worse, the respect of their clients.
Immediately after him came Saurabh Shandilya, who spoke about the Tor network. His talk cleared up some of the misconceptions that people have about Tor, and through his articulate speech he managed to convince many people to try it out. Not only that, he even managed to convince some folks who’d tried it earlier and given up to give it another shot.
Next in line was Deepak Pathania. Although Deepak says that it was his first ever talk, we have our doubts. We’ve seen seasoned speakers get uncomfortable on the stage but Deepak didn’t break a sweat. He spoke about the Google AMP project and why it’s a viable optimization strategy for your mobile pages. He also gave some examples of how to quickly start a project with Google AMP.
After a quick lunch following Deepak’s talk, it was Capture the Flag time! Dheeraj Joshi from the organising team had crafted some mind-tickling questions for the participants to rack their brains on. For the next two hours everyone was glued to the event, trying to find ways to get to the hidden flags. At the end of the day, Capture the Flag turned out to be the star attraction of the DevFest.
After the CTF came Neha Sharma, who spoke about Web Apps and Performance. Neha is a tech speaker and founder of the renowned JSLovers Community, and we were lucky to have her on the list of speakers. Given the breadth of her topic and the limited time she had for her talk, she could only give an abridged overview of how developers can improve their websites’ performance by using several best practices.
After Neha it was Manish Gill’s turn. Manish is a fellow Wingifighter who rose to the challenge of speaking at the DevFest when some external speakers backed out. He works in the Data Layer team at Wingify, the team which manages the performance and scalability of the data collection and retrieval aspects of our application. Having worked on challenging scalability problems and having experience in giving public talks, he was the ideal candidate to represent Wingify. Manish delivered an insightful talk about how we’ve used Postgres and Kafka to scale to the tune of 20k requests per second.
We finally finished the day with Code in the Dark. It was a long long day, and we’re glad we chose to end with it. Our in-house DJ, Ashish Bardhan, played the best of the best Techno music that we could’ve asked for. The dark settings along with the laser lights and the music set the right ambience to get the adrenaline pumping. It was intense! By the time the Code in the Dark ended everyone was rejuvenated.
All that, in one day. Achievement level: 50,000.
How did we fare?
There were many things we did well, and there were many things we could’ve done better. Our sound system definitely frustrated some of the speakers and audience members. It malfunctioned multiple times and broke the flow of the speakers. We should’ve also provided a visual timer for the speakers so they could keep track of their time. It wasn’t the smoothest event, I agree, but what doesn’t kill you makes you stronger. With these learnings we’ll be better prepared to have a smoother DevFest next time.
Some moments captured during the DevFest:
Our quest to have a community of like-minded people has just started. The first instance of the DevFest has been a stepping-stone for us and it’ll only get better from here. Stay tuned for the next DevFest. It’s going to be legen….. wait for it!
Shipping a bug-free feature is important in every release. To ensure this, we do quality analysis (QA) at various points of the feature cycle. To make QA efficient, we also maintain several environments for our app, each serving a different purpose. To be specific, we have the following environments:
Production - The actual live app.
Staging - A replica of the production where final sign-off QA is done just before going live.
Test - A quick deployable environment which can be used by developers to share the WIP feature branch with anyone in the company or among other developers.
With multiple features in development simultaneously and multiple environments to deploy to, automated deployment becomes very important for a frictionless and fast feature lifecycle. In this post, I’ll try to explain how we manage all these environment deployments through automation for our product, VWO.
As mentioned above, tests are very lightweight environments which developers generally create to share their WIP feature branch with other developers, QA, or someone from marketing/product to gather feedback. Our app consists of various components: frontend, main-backend, and various other micro-services. So each test environment is a combination of different branches from each of the constituent components. For example, say our app has the following components: frontend, backend and Service-1. Our tests can then look like:
And as these tests should have a unique sharable URL, they can be given names like: feat1.vwo.com or heatmap-optimizations.vwo.com
To deploy such a test we have a job on Jenkins. As you may have guessed already, the inputs to this job are:
Name of the test instance
Once this job runs, it pulls all 3 of the above branches onto a remote server, makes some configuration changes, and creates a virtual host that serves testname.vwo.com.
Now, even this job would require the developer to open the Jenkins web app, go to the job page, fill in the inputs, and then run it. But we avoid that too - enter Ramukaka! Ramukaka is our Skype bot (that we have open-sourced as well), which we use for various grunt-work tasks, such as running a Jenkins job!
With Ramukaka in the picture, our test deployment looks like so:
Note: We have 3 components but only 2 branches are specified. That is because the developer can skip a component if the branch to be deployed is the default, i.e. master. Also, if the test instance already exists, the same command just pulls the latest changes.
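The defaulting rule from the note can be sketched as a small function. This is only an illustration: describeTestDeployment is a hypothetical helper and the branch names are made up; the actual Jenkins job and Ramukaka command are not shown here.

```javascript
// Components that make up a test environment (taken from the example above).
const COMPONENTS = ['frontend', 'backend', 'service-1'];
const DEFAULT_BRANCH = 'master';

// Fills in the default branch for every component that was not specified
// and derives the sharable host name from the test name.
function describeTestDeployment(testName, branches) {
  const requested = branches || {};
  const resolved = {};
  COMPONENTS.forEach((component) => {
    resolved[component] = requested[component] || DEFAULT_BRANCH;
  });
  return { host: testName + '.vwo.com', branches: resolved };
}

// Only two branches are given; service-1 falls back to master.
console.log(describeTestDeployment('feat1', {
  frontend: 'feat1-ui',
  backend: 'feat1-api'
}));
```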
Staging has primarily 2 differences from test:
There is a single staging unlike multiple tests.
There are some more build steps involved compared to a test.
So it’s similar to a test deployment, except that before deploying, the developer has to build their branch like so:
Note: While building a branch we also tell the job which environment to build for (e.g. stagingapp above), because right now the code needs to be tweaked a bit according to the domain it’s deployed on.
And once Ramukaka confirms a successful build, the developer can deploy the staging with that branch:
Some more commands
As I had mentioned, we have just one staging (single gateway to production). Therefore, each deployment overwrites the previous deployment. And so it becomes important that developers do not overwrite each other’s deployment by mistake. To prevent this, we have an additional command in Ramukaka called currentBranch. Through this command anyone can check which branch is deployed for a particular component on the staging. Eg. if I need to check the frontend branch on staging, I would do so:
Now the developer can take appropriate actions based on the deployed branch.
The production is no different from the staging. Once the final round of testing is done by the QA team on staging, there are 3 things that need to be done to deploy the app on production:
Build the branch
Create a tag for release on master branch
Deploy the tag on the server
All the 3 tasks are handled through a single command on Ramukaka:
And the frontend gets deployed on production, just like that!
Note: Right now only the frontend deployment is automated for production. But we plan to do it for all the components of the app.
All this deployment automation saves us a huge amount of time. And we know we can save more. Using similar automation for every component of the app is something we plan to do next. Also better logging and monitoring of these environments is on the list.
How do you manage multiple environments? We would love to hear about your deployment techniques if you want to share in the comments.
I recently got an opportunity to speak at PyData Delhi. PyData is a tech group, with chapters in New Delhi and other regions, where Python enthusiasts share their ideas and projects related to Data Analysis and Machine Learning.
Talks at PyData
There were three talks at PyData, namely Machine Learning using TensorFlow, Data Layer at Wingify, and mine, Learning Data Analysis by Scraping Websites. All the talks were thorough and excellent! In his talk, Data Layer at Wingify, Manish Gill 🤓 explained how we handle millions of requests at Wingify.
Some images of the PyData Meetup hosted by Wingify.
Background About My Talk
Let me give you a little background. It was the Friday before the PyData Meetup/Conference. Our engineering team was going about its daily tasks. I had just grabbed a coffee to shake off my laziness. Suddenly, our engineering lead came and asked whether anyone could present a topic at the PyData meetup that we were organising the very next day. One of the speakers, who had confirmed earlier, had backed out at the last moment because he had fallen sick. I could see that most of the team members avoided volunteering at such short notice, and probably also because the next day was a Saturday (though this is my personal opinion). But I had something different on my mind, and amid the confusion I volunteered 🤓. I had a project that I had done back when I was learning Python, so I offered to present it. He agreed and asked me to keep the presentation ready.
Preparing the Project & Slides
That Friday night, I started searching for the old files which I had used. Finally, I found all of them on my website, downloaded them and ran the code. It worked like a charm 😍. Yeah! I quickly created the slides around it, and after finishing, smiled and went to sleep at 4.30 am.
Little About the Basics of My Talk.
The presentation that I gave was on Learning Data Analysis by Scraping Websites. During my college days, we heavily used the BeautifulSoup library in Python to scrape websites for many personal projects. During one of them, I got the idea to scrape data from websites that aggregate movie-related data. By doing that, I thought I could create a list of all the movies that I must definitely watch. The movies had to satisfy the following criteria:
Release date >= 2000
Rating > 8
Scraping websites and then analysing the result in a DataFrame was not the best idea at that time. But I learned a lot by scraping data from the website using BeautifulSoup, analyzing the data using Pandas, visualizing it using Matplotlib (a Python library), and finally coming to a conclusion about my movie recommendations.
Coming back to the objective - finding and sorting the movies released between 2000 and 2017 in order of relevance (I didn’t want to watch movies released before 2000).
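The selection criteria boil down to a filter followed by a sort. The talk itself used Python with BeautifulSoup and Pandas; the sketch below only illustrates the logic in JavaScript, with made-up records standing in for scraped rows. pickMustWatch is a hypothetical name.

```javascript
// Hypothetical records; titles, years, and ratings are invented for illustration.
const movies = [
  { title: 'Movie A', year: 1999, rating: 8.9 },
  { title: 'Movie B', year: 2008, rating: 9.0 },
  { title: 'Movie C', year: 2014, rating: 8.6 },
  { title: 'Movie D', year: 2016, rating: 7.4 }
];

// Keeps movies released in or after 2000 with a rating above 8,
// then sorts them by rating, highest first.
function pickMustWatch(rows) {
  return rows
    .filter((movie) => movie.year >= 2000 && movie.rating > 8)
    .sort((a, b) => b.rating - a.rating);
}

console.log(pickMustWatch(movies).map((movie) => movie.title));
// Movie B and Movie C qualify, in that order
```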
Below is the code to scrape IMDb for movie data from 2000 to 2017.
Click here to have a look at the full source code.
The analysis produced three views of the data: the DataFrame with Rating set as the index, the movies sorted by maximum rating, and the Year vs Rating trend.
Take away from the Talk
With this method, you can pick the top performers out of a data set. For example, suppose you want to create a cricket team (IPLT20) that has the maximum probability of winning a match. You can parse the IPLT20 website for the last 5 years’ data and select the top 5 batsmen and 6 bowlers 😎.
I totally understand that this may not be the best project for the data analysis. I am still learning and I showed what I had done. I believe that it served my purpose.
I will be doing more research on data analysis in Python. Thanks for reading this.
Below are my talk slides:
This article will deal with the issues we face with the current API architecture (mostly REST) and why demand-driven APIs seem a perfect replacement for it. We will also talk in brief, about GraphQL and how it is a feasible solution for implementing demand-driven applications.
Let’s take a simple example of author & articles. If we are given a requirement to develop an API to fetch authors or articles, it will most probably go like this, if we follow REST:
Let’s take an example where we have to show an article snippet on our website’s dashboard. We would need its title, description and author name. So we hit the latter endpoint, and it gives a response like:
There are two problems with this response:
1) Extra information: We only needed the title & description, but we got everything related to the article. We cannot get rid of this extra payload, as the extra information might be consumed by some other page, e.g. the Edit Article page.
2) Missing information: We were expecting the author name but instead we got authorId. This is bad, and to solve it we would probably make another network call to the former endpoint to get the author name. Making 2 network calls just to fetch 3 parameters is an overhead, don’t you think? Also, it will only get more complex as we include more resources, e.g. comments, images etc.
How Do Demand-driven Applications Work?
Now that we understand a few issues with REST-based APIs, we need a smart system which gives the client exactly the information it requires instead of partial or extra information. This can be solved if the client demands what it actually needs and the server gives it only that piece of information. This can be done using GraphQL.
Let’s try to solve our problem using GraphQL. The exact information that our client needs can be represented in GraphQL as:
The server can have a single end point with the following schema:
And each field in our schema can have a function to fetch that piece of information. In our case:
On running the query:
we get a response like this:
This is what we needed. Now we can keep the endpoint the same and tweak the requested fields to display the relevant information on any page of our website.
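The mechanism of per-field functions can be illustrated without any GraphQL library. The sketch below is a simplified model of the idea, not real GraphQL: the data, the resolvers map, and resolveFields are all hypothetical, and the requested fields are described with a plain object instead of a query string.

```javascript
// Hypothetical in-memory data standing in for a database.
const authors = { 7: { id: 7, name: 'Jane Doe', email: 'jane@example.com' } };
const articles = {
  1: { id: 1, title: 'Hello', description: 'A short intro', body: '...', authorId: 7 }
};

// One small function per field, mirroring the idea of per-field resolvers.
const resolvers = {
  article: {
    title: (article) => article.title,
    description: (article) => article.description,
    author: (article) => authors[article.authorId]
  },
  author: {
    name: (author) => author.name
  }
};

// `selection` lists the requested fields: `true` for a plain value,
// a nested object for a related type. For simplicity, the nested type
// is assumed to share its name with the field ("author").
function resolveFields(typeName, source, selection) {
  const result = {};
  Object.keys(selection).forEach((field) => {
    const value = resolvers[typeName][field](source);
    result[field] = selection[field] === true
      ? value
      : resolveFields(field, value, selection[field]);
  });
  return result;
}

const snippet = resolveFields('article', articles[1], {
  title: true,
  description: true,
  author: { name: true }
});
console.log(JSON.stringify(snippet));
// {"title":"Hello","description":"A short intro","author":{"name":"Jane Doe"}}
```

Only the requested fields are computed and returned, so the body and authorId never reach the client, and the author name arrives in the same response.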
Advantages of Demand-driven APIs
1) Single end point for serving any piece of information.
2) Less payload of data as no extra information is served.
3) Versioning of APIs becomes simpler as we can control the exact information required.
Disadvantages of Demand-driven APIs
1) Latency may increase due to a single end point handling all the querying of data.
2) No lazy loading possible as it’s a single call which will contain all the data.
Try it Out
If you think GraphQL is promising go ahead and try it out. There is much more to it that you will love to learn. Check out its official documentation. It has been implemented in all the well known languages and you can find it all here.