Building a quick and dirty data collection app with React, Google Sheets and AWS S3

Covid-19 has created a number of challenges for the society that people are trying to solve with the tools they have. One such challenge was to create an app for data collection from volunteers for food supply requirements for their communities.

This needed a form with the following inputs:

  1. Some text inputs like the volunteer’s name, his vehicle number, address of delivery..etc.,
  2. The location in geographic coordinates so that the delivery person can launch google maps and drive to the place
  3. A couple of photos of the closest landmark and the building of delivery.

Multiple ready made solutions like Google Forms, Zoho Forms were attempted, but we hit a block when it came to having a map widget which would let the location to be picked manually, and uploading photos. After an insightful experience with CovidCrowd, we were no longer in a mood to build a CRUD app with Database, servers..etc., So the hunt for low to zero maintenance solution resulted in a collection of pieces that work together like an app.

Piece 1: Google Script Triggers

Someone has successfully converted a Google Sheet into a writable database (sort of) with some Google Script magic. This allows any form to be submitted to the Google Sheet and the information would be stored in the columns like in a Database. This solved two issues, no need to have a database or a back-end interface to access the data.

Piece 2: AWS S3 Uploads from Browser

The AWS JavaScript SDK allows direct upload of files into buckets from the browser using the Congnito Credentials and Pool ID. Now we can upload the images to the S3 bucket and send the URLs of the images to the Google Sheet.

Piece 3: HTML 5 Geolocation API and Leaflet JS

Almost 100% of this data collection is going to happen via a mobile phone, to we have a high chance of getting the location directly from the browser using the browser’s native Geolocation API. In a scenario where the device location is not available or user has denied location access, A LeafletJS widget is embedded in the form with a marker which the user can move to the right location manually. This is also sent to the Google Sheets as a Google Maps URL with the Lat-Long.

Piece 4: Tying it all together – React

All of this was tied together into a React app using React hook form with data validation and custom logic which orchestras the location, file upload ..etc., When the app it built it results in a index.html file and a bunch of static CSS and JS files which can be hosted freely as Github Pages or in an existing server as a subdirectory. Maybe even server over a CDN gzipped files, because there is nothing to be done on the server side.

We even added things like image preview in the form so the user can see the photos he is uploading on the form.

resource_form

Architecture Diagram

resource_form_architecture

Caveats

  1. Google Script Trigger Limits – There is a limit to how many times the Google Script can be triggered
  2. AWS Pool ID exposed – The Pool ID of with write capabilities is exposed to the world. If there is someone smart enough and your S3 bucket could become their free storage or if you have enabled DELETE access, then lose your data as well.
  3. DDOS and Spam – There are also other considerations like Spamming by watching the Google Script trigger or DDOS by triggering with random requests to the Google Script URL that you exhaust the limits.

All of these are overlooked for now as the volunteers involved are trusted and the URL is not publicly shared. Moreover the entire lifetime of this app might be just a couple of weeks. For now this zero maintenance architecture allows us to collect custom data the we want.

Conclusion

Building this solution showed me how problems can be solved without having to write a CRUD app with a admin dashboard every time. Sometimes a Google Sheet might be all that we need.

Source Code: https://github.com/tecoholic/ResourceForm

PS Do you know Covid19India.org is just a single Google Sheet and a collection of static files on Github Pages? It servers 150,000 to 300,000 visitors at any given time.

NiftyBot

The Mastodon ecosystem is really nice. The concept of Fediverse bringing in decentralized social network is a breath of fresh air. I created NiftyBot account in botsin.space – a dedicated server for Mastodon Bots.

What is NiftyBot?

  • It is a Mastodon Bot Account

What does it do?

  • It posts updates about Indian Markets
  • Currently it posts NSE closing report at 4.01 PM everyday. Sample post below

niftybot-sample

How does it work?

It is a Python script running in AWS Lambda.

lambda-niftybot

A scheduler tiggers the Lambda Function at 4.01 every Monday – Friday. The lambda function is a Python Script that fetches the necessary details from NSE’s output data and posts to Mastodon.

Source Code:

Some asked about if this bot is open source. Obviously, you see the source right here. 🙂 Still I will add the license here.

The above source code is released into the Public Domain. You can do what ever you want with it.

How much does it cost to run this Bot?

Nothing.

Numbers Please:

The AWS Lambda Free tier comes with 1 Million requests and 400,000 GBSec, which is a combination of how much memory we use and the time taken by our process. Since I have used the CloudWatch Scheduler Event as the trigger, I am using 20-22 requests, the Python function takes about 60 MB to run so running at the lowest memory of 128MB block, and usually completes in around 2600-2700 msec. The metrics says my worst billed event so far is about 0.3375 GBSec. With about 20-22 trading days in a month, I might use a total of 8-10 GBSeconds, leaving enough room to run many more bots like this one 🙂

Districts of Tamil Nadu – Hexagonal Maps

Hexagonal maps are useful for creating data visualisations of data points that are a representation of quantities that require equal sized polygons. For example Election Maps. While geographic map of the assembly and parliamentary constituencies might be used for visualising election results, it is a false one. The constituencies are created based on the number of people in a region under the principal of a representative for every X number of people. In such a scenario, using a geographic representation for a place like Chennai, which is made of 3 parliamentary constituencies, doesn’t give us the same perception as the 3 constituencies Kanniyakumari, Thoothukudi, and Thirunelveli put together. But in reality that’s what it actually means.

hex_comparison

  • Geographic Representation – Skewed representation of the base data. Unequal real estate for equal (approx) number of people
  • Hexagonal Representation – Correct representation of the base data. Equal real estate for equal number of people.

Now that we have the Parliamentary constituencies in Hex Map form, why not have the Districts as well.

TN_hexTN_2019

If you need the base data to create your own maps. The GeoJSON files for the same are available here https://github.com/tecoholic/Geographic-Data

Map of PM Modi’s Domestic Visits

PM Modi visited Tamil Nadu on 27th January 2019 for the AIIMS Hospital ground breaking ceremony. Twitter was trending with #GoBackModi and #TNWelcomesModi and I was curious about the number of times has PM Modi visited Tamil Nadu before.

The PM India site has a neat list of all the visits http://www.pmindia.gov.in/en/pm-visits/?visittype=domestic_visit

So, I created a map out of it.

Visits_by_PM_Modi.png

Update:

This map was replaced after some errors were discovered in the base data.

Literacy Gap of SC community in TN districts

I was going through the Census 2011 data once again and Erode district’s low Schedule Caste (SC) literacy rate caught my eyes. It is not a very lagging state when in overall literacy. But its SC literacy was less than the least literate district of Dharmapuri. So I added the data to the TN Districts shapefile and visualised it to see how lagging are the SC community across the districts.

Here are the maps

tn_overall_literacy_2011tn_sc_literacy_2011general_sc_literacy_gap

Correction

In an earlier map, the gap of Thoothukudi was mentioned as -14%, while the actual gap is around 6% due to a typo during the data processing. The map has been updated to reflect the change.

My observations

  1. Kongu Belt (Coimbatore, Tiruppur, Erode) is the worst. ~~The Gounder (land owning) community has ensured their position and the social ladder and ensured the peasantry remained uneducated and illiterate.~~

Update: While there might be an element of truth to it, the maps alone are not indicative of the inference. I have made the above observation based on the number of issue that have appeared on the media like the Mid-day meal staff harassment, two tumbler system etc.,

  1. Dharmapuri is a peculiar case, it has the lowest overall literacy in TN, but it is also the only district where SC community is more literate than the general population.
  2. Kanniyakumari which tops the overall literacy rates also tops the SC literacy. In fact the SC community of Kanniyakumari is more literate than the general population of almost all other districts. I think it would be an interesting place of humanities research in the area of literacy, education and caste.

Data Source:

http://www.tn.gov.in/deptst/areaandpopulation.pdf

India Literacy Map with a How-To

I published the Tamilnadu district wise literacy map some days ago and @tshrinivasan asked if I can write a blog on how to do it, and here it is now.

What are we going to do?

We are going to create India’s State Wise Literacy Map. It will be a Choropleth map ℹ️ just like the Tamilnadu one.

Things we need

  1. QGIS – An Open Source software that will be used to process the geographic data and create the map. Download and install it from https://qgis.org/ for your operating system.
  2. Base map – The digital map of India with its state boundaries as a shapefile. ℹ️ You search the internet for “India states shapefile”, there are a number of sources where you can find this. I am going to use the one from the Hindustan Times public repository. [shapefiles/india/state_ut/india_2000-2014_state.zip] ⬇️Download, Unzip the file and keep it ready. I am choosing the pre-Telangana map because the literacy data is from 2011 which is pre-Telangana.
  3. Data on literacy levels of the Indian states. An internet search for “India states literacy csv” would give a number of results. I am going to use the one from the Census 2011 website. ⬇️Download

Get the data ready

We have 2 sources of data:

  1. Geographic data which we downloaded from the Hindustan Times
  2. The Literacy data from the Census 2011 website

Both the datasets need to be joined to create the map. Let us do that:

  1. Open QGIS and create a new project. From menu select Project -> New Project
  2. Add the map using Layer -> Add Layer -> Add Vector Layer. Browse to the location of the downloaded shapefile, select the india_2000-2014_state.shp file and click Add. Add_layer
  3. You will be asked to select the coordinate system. Select WGS84 and click OK. Once the layer is added close the Add layer button. Select CRS
  4. Now you should have the map loaded to the main area, and should see the legend entry for the data layer like this. Base Map added
  5. Now right click on the layer and select Open Attribute TableOpen Attribute Table.png
  6. You will notice it has only two columns – the id and the state name. We are going to create a new column and add the literacy rates from the census data. In the Attribute Table, click the yellow pencil icon (first one in the icon bar) to start editing.
  7. Click the Add Column button and add the literacy column with type decimal. Add_columnliteracy column
  8. Now enter the literacy rates from the excel sheet into the newly added column. Sidenote: There is an automated way to combine the data without having to manually enter the data if you have the data in a delimited text file like CSV. It involves adding a something called a Data Layer. We will take the manual route to keep it simple.
  9. Once you have added the literacy values. Click Save Edits icon (Ctrl+s).  Now click the “Yellow Pencil” button again to stop editing. This is very important. Otherwise, you might unknowingly click at some place and change the geometry of the state boundaries.
  10. Now you should have the data in the attribute table like this. Attribute Table with Literacy.png
  11. Close the Attribute Table.

Styling the map

  1. In the Layers sidebar right click on the map layer and select Properties.
  2. In the Properties window, select Symbology from the side menu. Layer Properties.png
  3. In the Styling window make the following changes. Styling.png
    1. A – Change the style from “Single Symbol” to Graduated
    2. B – Select “literacy” as the column
    3. C – Set Precision to your liking (it denotes the decimal points of the values to be shown in the map legend). I prefer 0 or 1 usually.
    4. D – Choose a Color Ramp to your liking. I am choosing the one suitable for Wikipedia based on the Wikipedia Conventions.
    5. E – Set the mode to “Pretty Breaks”. Now as soon as you select this, the “Classes” tab right above it should be populated automatically. If not, use F.
    6. F – If your classes didn’t appear automatically, click the “Classify” button.
  4. Once you are satisfied with the Legend precision and the color ramp, click OK to see your styled Choropleth map. styled map.png

Note: The properties dialog provides a huge number of options to do a number of things including labels. Refer to a QGIS manual or tutorials on the web for related information.

Exporting the map

Now we have the styled map according to our liking ready. We need to export it to an image so that we can share it across.

  1. Click the “New Print Layout” button. Enter a name, I named mine “export” and click ok.new print layout
  2. You will get the Layout window with an empty page. empty print layout.png
  3. From the menu, select Add Item -> Add map. Click and drag the cursor to the required size. map inserted.png
  4. (Optional) There is a lot of white space around the map inside the box. We can make the map a little bigger by reducing the scale. On the right side switch to the Item Properties tab and reduce the value for Scale. (Mine was 17485874 and I changed it to 12500000). rescaled map.png
  5. Click Add Item -> Add Legend. Click and drag the cursor to create the Legend. India’s maps usually use the Bay of Bengal for that, I am going to do the same. Legend Added.png
  6. You will notice that the legend title says the layer name. But what we really want it to say is “Literacy Rate”. There are two ways to fix that. Choose the one that appeals to you.
    1. On the right in the Item Properties tab, under Main Properties, you can enter a title as “Literacy Rate”
    2. On the right in the Item Properties tab, under  Legend Items, double-click on the layer name and enter “Literacy Rate”
  7. Now there is some extra white space on the right. Let us clean that up. On the right side select Layout tab, scroll down to Resize Layout to content and click Resize Layout. Now the page should have been resized to only the map. cropped to content.png
  8. From menu click Layout -> Export as Image. Enter the filename in your desired location and save it. You could also export as PDF if you want to print.

Note: Apart from just the map and legend you can do a lot more complex things with the layout manager. Again, refer to a QGIS manual and other tutorials on the internet to fully learn about them.

Final Product

IndiaLiteracyMap.png