OSM Mapping with AI from Facebook

The article “AI is supercharging the creation of maps around the world” describes a very interesting development in OSM mapping. It could perhaps reduce the number of hours required to map an area by a couple of orders of magnitude.

Automatic feature extraction is nothing new, of course. People have been doing it for decades. I should know: I spent two years (supposedly) mastering Remote Sensing, the domain that deals with capturing information about the earth using satellites and extracting useful information from the captured data. The traditional remote sensing workflow uses techniques like Supervised Classification and Unsupervised Classification for feature extraction. The beauty of machine learning algorithms is that models can be built which merge these two classification techniques and automate the process.
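To make the two terms concrete, here is a minimal sketch in plain Python on made-up single-band brightness values. It is only an illustration under those assumptions, not what Facebook's pipeline or any remote sensing package actually does: the class labels, the numbers and the function names are all hypothetical. The supervised part learns class means from labelled pixels (a nearest-centroid classifier), while the unsupervised part groups unlabelled pixels with a one-dimensional k-means.

```python
# Toy single-band "pixels": brightness values between 0 and 255 (made-up numbers).

def fit_centroids(labelled_pixels):
    """Supervised step: average the labelled training pixels of each class."""
    sums = {}
    for value, label in labelled_pixels:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict(centroids, value):
    """Assign a pixel to the class whose mean brightness is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def kmeans_1d(values, k=2, iterations=10):
    """Unsupervised step: group pixels into k clusters (k >= 2) without labels."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted values.
    step = (len(ordered) - 1) / (k - 1)
    centres = [float(ordered[round(i * step)]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(centres[i] - v))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres


# Supervised: a human has labelled a few training pixels.
training = [(20, "water"), (30, "water"), (200, "built-up"), (220, "built-up")]
centroids = fit_centroids(training)
print(predict(centroids, 40))    # water
print(predict(centroids, 180))   # built-up

# Unsupervised: no labels, the algorithm only finds two groups.
centres = kmeans_1d([18, 25, 35, 190, 205, 230], k=2)
print(centres)
```

The supervised version needs someone to label training pixels but returns named classes. The unsupervised version needs no labels but only returns anonymous groups that a person still has to interpret.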

The OSM community, AFAIK, has stayed away from automated data generation, mostly because cleaning up bad data is more difficult and tedious than creating new data. With the collaboration of HOT OSM and the development of RapiD (an enhanced iD editor), it looks like the process put in place ensures that only quality data goes into the system, which prevents the bad-data problem from arising.

While it is all colourful for the OSM community, the question with profit seekers like Facebook is always: what’s the catch? I tend to think there is no catch for the OSM community, and that FB has decided to invest some money in OSM so that it can use the fruits of the community's work for its own benefit. It looks like a mutually beneficial arrangement. The community gets resources from FB to create a better dataset, and FB gets the data for its own use. I am sure FB will extract more value out of its investment in the long run and reap more benefits than the community. They will obviously have their own data layers on top of the OSM data; they will overlay all sorts of tracking data and enhance their capabilities as a surveillance machine. But all that is left to them and their corporate interests.

For now this is a good thing and doesn’t seem to do any harm to OSM.

Internet Dot Org

Yesterday one of my friends posted a message on Facebook saying he supports Facebook's InternetDotOrg initiative and pointed to a post by Mark Zuckerberg. I had some free time and wrote an elaborate explanation of why it is not really a good solution for bringing internet to the unconnected masses. The following text is a repost of the same.


Comment #1:

Hi. First, the issue behind InternetDotOrg is not about internet. It is about control and flow of information.

Let me break it down one step at a time:
The last-mile internet penetration you have in mind is already being worked on at various levels, including fibre laying for e-Governance initiatives. This report http://www.indiatelecomonline.com/telecom-penetration-in…/ says 91% of rural India is connected by at least one mobile operator.

Comment #2:

Next is affordability: now that is a question we can all ponder. Take my operator, Airtel. In 2008-09, unlimited 2G data for one month was 98 rupees; then it was 2GB for 98, then 1GB for 98, then it started climbing to 125, 145, and now it is 175 for 1GB of data. Who is driving the cost up? Is there more demand than supply? Why is the cost of 2G data rising when Airtel is rolling out 4G?
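The price climb is easier to see as a cost per GB. Here is a rough sketch using only the figures quoted above; the unlimited pack is left out because it has no per-GB price, and the 125 and 145 rupee steps are left out because their data allowance is not stated explicitly. The variable and function names are my own.

```python
# (price in rupees, data in GB) for the packs with a clearly stated allowance.
packs = [(98, 2), (98, 1), (175, 1)]


def price_per_gb(price, gigabytes):
    """Rupees paid for each GB of data in a pack."""
    return price / gigabytes


rates = [price_per_gb(price, gb) for price, gb in packs]
increase = rates[-1] / rates[0]

print(rates)               # [49.0, 98.0, 175.0]
print(round(increase, 2))  # 3.57
```

So, going by these figures alone, a GB of 2G data costs roughly three and a half times what it did when 98 rupees bought 2GB.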

Comment #3:

I will try to answer them, but I don’t have all the answers. Demand – Supply: Yes, with smartphones demand has increased drastically, but instead of moving users to a higher-bandwidth platform like 3G or 4G, telcos are making the most of the situation by raising prices on a congested band (whether it is really congested is a debate in itself :)). If billed properly, the urban masses could be migrated to 3G networks pretty easily, but because of the crappy phones that most people buy, 3G battery drain is a bottleneck for such an expansion. So everything is caught in a muddle, and telcos have chosen the easy way out: charge more for what people use more.

Comment #4:

Due to this muddling the once easily available internet has moved farther from the ‘poor and unconnected’ masses that InternetDotOrg says it wants to connect. The actual reality is everyone is connected already, only people don’t know it.

Comment #5:

Now let us talk about this “disadvantaged people of the society who are deprived of opportunity” and how IDotOrg is positioning itself. Disadvantage – Technologically there is none. Economically – YES. So IDotOrg says it is going to give free access to NOT INTERNET but “..offers free access in local languages to basic internet services like …”, the key difference being internet versus internet services. So anyone who partners with FB gets a path to reach the population. If you don’t partner, you don’t get a customer. Simple. Personally I am not much worried about this at this point from a business standpoint, and let's not talk Net Neutrality here. At the end of the day it is a choice, and even I might use it, because all I care about on mobile internet is WhatsApp. For everything else I have my home WiFi.

Comment #6:

What bothers me is the things that we don’t see. While giving a selection of internet services is one way to solve the problem, there are surely many others.

  1. Use public corp like BSNL to provide cheap or free internet in rural areas.
  2. Why not repurpose barely used resources like EduSat satellite?
  3. Why not create a whitespace in the mobile spectrum and leave it free for hyper-local non-profit telcos like a Community Radio?
  4. Use other white space solutions like the ones Microsoft proposes http://www.business-standard.com/…/govt-to-test-white…
  5. Use something like Google’s Project Loon with a more stationary setup.

All these solutions are driving at providing affordable internet connectivity, not just Internet Services. Nobody owns airwaves, it is not some exhaustible resource. All of the above won’t happen because there is NO MONEY in it and nobody really cares about a sustainable long term solution for affordable internet.

Comment #7:

Facebook is doing it because user acquisition has plateaued. It is not a startup but a listed corporation. It needs to keep shareholders happy, and one of the key metrics is user base. The next billion users of FB are not going to come from America or Europe, but from South Asia and Africa. So there needs to be a way to tap into this potential user base, and that cannot happen because of the bottleneck – affordability. So through IDotOrg, FB is trying to acquire users. You might say it is a win-win situation for both the users and FB. At this point I should agree. But I won’t. Because I am going to apply a simple test here: imagine the most vulnerable person, who has zero idea about online privacy, data tracking, targeted advertising, behaviour tracking, face recognition and the other technology used by FB. Now can IDotOrg collect enough data to make this person’s life miserable? Does the person have any idea of what he is giving up for what he is getting? Is it the right price to pay? Are YOU willing to be that person? Are you willing to live in a flat with the perfect view of the garden, but never be allowed to enter the garden? I cannot. I am not willing to be that person. I don’t want my idea of the internet, my access to information, and my choice of a particular web service decided by a for-profit corporation which makes money based on people’s choices and behaviour.

Comment #8:

I would request you to reread Zuckerberg’s post and really see whether he has anything to say except marketing crap. I suppose he is doing nothing but selling.

Getting ready for planet CODE

With the end of the TeachForIndia Fellowship around the corner, I am contemplating jumping into the programming world. So here is a checklist of things that I am going to put in place and document along the way.

Get a decent mail ID

Though not strictly related to planet code, I wanted to have a decent email id. The one I used, ~~aruntheguy at gmail~~, was created when I was 15 and is not very professional. Hence I have moved the mail to arunmozhi.in.

Creating an online presence

These days online presence is mostly Social Media. I deleted my Facebook account and have only Twitter, but that alone isn’t sufficient.

So,
Moved blog from static-site Pelican back to WordPress for easy theming and maintenance
Created a home page where I can showcase stuff

Customer Research

Making use of a skillset is all about selling it. So what do the buyers look for? Being a follower of Coding Horror, I started off with his post on How to hire a programmer.

Here are the things I learnt:
1. Know how to actually write a program – I pass
2. Should have a portfolio – Need to create a portfolio page
3. Be culturally fit for the role – Depends on the company
4. Answer computer science theoretical/tricky/nerdy questions – Well, you can’t really prepare for them, can you? Maybe the theory part, but again, when was the last time I thought about hash tables? 3rd year of my college?
5. Expect an audition project or a real world problem to solve.
6. Pitch in front of small group – have mixed feelings about this. But good to know this might be coming.

1 is done, 2 requires some work, and 3 can’t really be worked on. 4 depends on how we look at it: solving puzzles and learning nerdy jokes is not the point; it shows the recruiter how we approach a problem and how passionate a person is about programming. I will leave it at that. Finally, points 5 and 6 depend on personal preference and circumstances, as outlined in the comments by various people. I am sort of going to forget them for now.

Product preparation

I am the product, in case you are wondering. The article cited above and others linked in that article give a general idea about the product.
With that in mind, and my long time personal goals, here is the list of things I have in mind to do:

  1. Learn touch-typing!! Yeah, I am a pecker – albeit a fast one. – Done
  2. Learn Vim to the extent I am not going into v mode every time to delete a set of words. – Ever learning
  3. Get core commit rights for QGIS – it's time to work on it – Couldn’t fit in time.
  4. Create a Portfolio/Resume/HireMe page
  5. Learn some tools of trade:
    • Plain text manipulations – Regex
    • Shell scripting
    • Code Editor – Vim
    • Version Control – Git + Github
    • Debugging Tools
    • Unit Testing Frameworks
  6. …….

I guess I will add more to the list as I set the goals. For now this should keep me focused.

The year of 2014

Basically I attempted two projects as a personal development venture.

52 Weeks 52 Books: Started with the objective of increasing the number of books that I read in a year. Read a total of about 27 books this year. Better than any other year in my life. Though the project is a failure, I reaped huge satisfaction in the process.

52 Weeks 52 Maps: Started with the aim of creating and uploading maps for Wikipedia. I think I did only 8 in all. Even those were done in the initial one or two weeks. This failed spectacularly.

Hopefully 2015 will be a better year.

Obsession

I have been observing a pattern in my life over the past few months. I am obsessed with something in the evenings and in my free time. It was books for a month, Far Cry 3 for another, and has recently turned into Chess.

I am trying to understand the underlying factor which is responsible for this behaviour. After reading through some pages about impact of games on human brain, watching the TED talks like Jane McGonigal: Gaming can make a better world and assuring myself that I am not really going crazy, I think I have a plausible answer.

Like all young people I need to have that sense of achievement.

Being an introvert, the above explanation makes a lot of sense. I am not uploading pics to Facebook, and I am not tweeting even an average of 1 tweet/day – other things that could supply the achievement and appreciation factor I am looking for.

Obsession Hacking

The word hacking is being used in a lot of places where it means “modification” or “change” or “tweak”. I am trying to use it for channeling my obsession into something that could be productive – as in work – as well as supply me the required achievement factor. One activity which I know could do that is – Coding.

Taking a look at what I have done in 2014:

[Image: github_dismal]

I think I will do what John Resig recommends – Write Code Every Day, starting from today, December 1, 2014. Let me see how far the obsession hacking goes.

Update: December 20, 2014
Well. This isn’t as simple as it seems. Gaming, reading books, chess – all have been entertaining and relaxing, because they are consumption of content. But coding is production of content, and hence has proved to be a much more difficult and straining task. I haven’t been able to get to coding at all. The experiment so far has been a big failure.

The Setup

Continuing from the previous post, let me write down everything that defined my work setup.

The Curriculum

This is where the fun starts. I worked with two different curricula throughout the year. The school wanted me to teach the mandated Samacheer syllabus, and the organisation that I work for wanted me to teach to the Common Core Standards. The Samacheer part covered Social, Science and “English as a subject”, while the organisation gave me Mathematics and “English as a standard” to teach. It is actually painful to be a teacher and teach language either as a subject or as a set of standards. More on that separately sometime later (which is almost never).

The Red Ink

There were 37 notebooks for each of the 3 Samacheer subjects to be checked and corrected 3-5 times every term. Each term is about 3 or 4 months long. And there were 2 mid-term tests and 1 end-of-term test each term. For the organisation’s part, we were supposed to conduct Unit Assessments, which happen once in 6 weeks, Weekly Assessments and, if possible, Daily Assessments. I just did the Unit Assessments. I tried Weekly Assessments but dropped them after a couple of weeks, as it was getting out of hand. English made up for it by making me correct a set of at least 10 questions every alternate day. I remember sitting, standing, sleeping, walking and even jumping on/off trains with my bag on the shoulder, papers in the left hand and red pen in the right.

The Sessions

The sessions were the organisation’s way of making sure we were fully equipped to handle everything in the classroom. They were usually planned in the evenings after school, when our glucose levels were at their lowest and we were looking for a corner to curl up in. The sessions did make a lot of sense to the people who were organizing them. They were usually about how to teach, how to handle kids, and how to understand a particular area in order to deliver it the way it is supposed to be delivered. But one thing no one seemed to care about, understand or grasp was that there is no single way to do stuff.

The Printer

Canon LBP2900. One trademark of being a TFI fellow is we print more paper for each kid than what government or the school would. Having a laser printer really does help. One can be free of the timing restrictions imposed by the Xerox shops and save a lot more money. I printed about 8000-9000 pages in the last 4 months alone. 1500 rupees for all that paper and 400 rupees for the toner and the immense flexibility of being able to print whatever and whenever.
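For the curious, the per-page cost works out as below. This is a back-of-the-envelope sketch using only the figures above (1500 rupees of paper, 400 rupees of toner, 8000-9000 pages); it ignores the price of the printer itself and electricity, and the variable names are my own.

```python
paper_cost = 1500          # rupees spent on paper over the 4 months
toner_cost = 400           # rupees spent on toner over the 4 months
pages_low, pages_high = 8000, 9000

total_cost = paper_cost + toner_cost

# Fewer pages printed means each page carries a larger share of the cost.
cost_per_page_max = total_cost / pages_low
cost_per_page_min = total_cost / pages_high

print(total_cost)                     # 1900
print(round(cost_per_page_max, 4))    # 0.2375
print(round(cost_per_page_min, 4))    # 0.2111
```

That is somewhere between 21 and 24 paise a page for consumables.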

The Travel

The travel was two/three legged. I usually started off with a short bus ride on the 5E/23C/49 from Adyar Depot to Madhya Kailash, took a train from Kasthuribai Nagar station to the Beach Station, and then finally took the 44C from Beach Station to the Power House stop. Sometimes the 5E-Train combo was replaced by the 21H/PP19 from Adyar Depot to Parry’s Corner. Initially I used the 6D from the back side of Adyar Depot, but the extra 300m of walking and the lack of alternate buses made me switch to other options. One good thing about the train travel is that I always found space to sit and even work on the laptop if required. Having a monthly season ticket for just 105 rupees was another boon. I never had to worry about tickets, queues or oversleeping during return journeys.

 

These define the physical boundaries of how I worked in the past one year. But how did I actually work? What was “The process”?