Blog

Internet Dot Org

Yesterday one of my friends posted a message on Facebook saying he supports Facebook’s InternetDotOrg initiative, and pointed to a post by Mark Zuckerberg. I had some free time and wrote an elaborate explanation of why it is not really a good solution for bringing internet to the unconnected masses. The following text is a repost of the same.


Comment #1:

Hi. First, the issue behind InternetDotOrg is not about internet. It is about control and flow of information.

Let me break it down one step at a time:
Solving last mile internet penetration, as you imagine it, is already being done at various levels, including fibre laying for e-Governance initiatives. This report http://www.indiatelecomonline.com/telecom-penetration-in…/ says 91% of rural India is connected by at least one mobile operator.

Comment #2:

Next is affordability: Now that is a question we all can ponder. Take my operator, Airtel. In 2008-09, unlimited 2G data for one month was Rs 98; then it was 2GB for 98, then 1GB for 98; then the price started climbing to 125, then 145, and now it is 175 for 1GB of data. Who is driving the cost up? Is there more demand than supply? Why is the cost of 2G data rising when Airtel is rolling out 4G?

Comment #3:

I will try to answer them, but I don’t have all the answers. Demand – Supply: Yes, with smartphones demand has increased, and drastically. But instead of moving users to a higher-bandwidth platform like 3G or 4G, telcos are making the most of the situation by raising prices on a congested band (whether it is really congested is a debate in itself :)). If billed properly, the urban masses could be migrated to 3G networks pretty easily; then again, with the crappy phones that most people buy, 3G battery drain is a bottleneck for such an expansion. So everything is caught up in a muddle, and telcos have chosen the easy way out: charge more for what people use more.

Comment #4:

Due to this muddling the once easily available internet has moved farther from the ‘poor and unconnected’ masses that InternetDotOrg says it wants to connect. The actual reality is everyone is connected already, only people don’t know it.

Comment #5:

Now let us talk about these “disadvantaged people of the society who are deprived of opportunity” and how IDotOrg is positioning itself. Disadvantage – Technologically there is none. Economically – YES. So IDotOrg says it is going to give free access NOT to the INTERNET but “..offers free access in local languages to basic internet services like …”, the key difference being between internet and internet services. So anyone who partners with FB gets a path to reach the population. If you don’t partner, you don’t get a customer. Simple. Personally I am not much worried about this at this point from a business standpoint, and I am not going to talk about Net Neutrality here. At the end of the day, it is a choice; even I might use it, because all I care about on my mobile internet is Whatsapp. For everything else I have my house Wifi.

Comment #6:

What bothers me is the things that we don’t see. While giving a selection of internet services is one way to solve the problem, there are surely many others.

  1. Use public corp like BSNL to provide cheap or free internet in rural areas.
  2. Why not repurpose barely used resources like EduSat satellite?
  3. Why not create a whitespace in the mobile spectrum and leave it free for hyper-local non-profit telcos like a Community Radio?
  4. Use other white space solutions like the ones Microsoft proposes: http://www.business-standard.com/…/govt-to-test-white…
  5. Use something like Google’s Project Loon with a more stationary setup.

All these solutions are driving at providing affordable internet connectivity, not just Internet Services. Nobody owns airwaves, it is not some exhaustible resource. All of the above won’t happen because there is NO MONEY in it and nobody really cares about a sustainable long term solution for affordable internet.

Comment #7:

Facebook is doing it because user acquisition has plateaued. It is not a startup but a listed corporation. It needs to keep shareholders happy, and one of the key metrics is user base. The next billion users of FB are not going to come from America or Europe, but from South Asia and Africa. So there needs to be a way to tap into this potential user base, and right now that cannot happen because of the bottleneck – affordability. So through IDotOrg FB is trying to acquire users. You might say it is a win-win situation for both the users and FB. At this point I should agree. But I won’t. Because I am going to apply a simple test here: imagine the most vulnerable person, who has zero idea about online privacy, data tracking, targeted advertising, behaviour tracking, face recognition and other technology used by FB. Now can IDotOrg collect enough data to make this person’s life miserable? Does the person have any idea of what he is giving up for what he is getting? Is it the right price to pay? Are YOU willing to be that person? Are you willing to live in a flat with the perfect view of the garden, but never be allowed to enter the garden? I cannot. I am not willing to be that person. I don’t want my idea of internet, my access to information, and my choice of using a particular web service decided by a for-profit corporation which makes money based on people’s choices and behaviour.

Comment #8:

I would request you to reread Zuckerberg’s post and really see whether he has anything to say except marketing crap. I suppose he is doing nothing but selling.

Python mock – Handling namespaces

Unit testing a big Python application comes with its own set of worries, which include mocking calls to parts of the code that we will test somewhere else.

Let us say I have a **utils.py** (every project has one anyways)

# utils.py
def word_length(name):
    # here it is a trivial function
    # assume that this is a costly network/DB call
    return len(name)

 
And I have another module which uses `word_length` via the `from module import function` syntax.

# user.py
from project.utils import word_length

def calculate(name):
    length = word_length(name)
    return length

Now I want to unit test all the functions in **user.py**. Since I am testing just the user module and want to avoid the costly calls made by my utils module, I am going to mock out calls to `word_length` using the Python mock library.

# test_user.py
from project.user import calculate

from mock import patch

@patch('project.utils.word_length')
def test_calculate(mock_length):
    mock_length.return_value = 10
    assert calculate('hello') == 5

One would expect this assertion to fail because we have mocked out `word_length` to return 10. But it passes, which means our mock is not working. Why? Because **NAMESPACE**. We patched the function in the utils namespace, but `from module import function` bound the name `word_length` into the user.py namespace when user.py was imported, and `calculate` looks it up there. So we need to patch the function in the user namespace, where it is used, and not in utils, where it is defined. So change the line


@patch('project.utils.word_length')

# to

@patch('project.user.word_length')

But what if we had simply imported the module, like this:

# user.py
from project import utils

def calculate(name):
    length = utils.word_length(name)
    return length

This time we can use `@patch('project.utils.word_length')` straight away: we import the entire module, and `calculate` looks up `word_length` on the utils module at call time, so the patched attribute is what it finds.
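
To see both cases side by side, here is a minimal sketch that runs as a single file. It is an illustration under assumptions, not the project layout above: `demo_utils`, `demo_user` and `demo_user_mod` are hypothetical stand-in modules built in memory, `make_module` and `calculate_with_patch` are helper names made up for this example, and it uses `unittest.mock` from the standard library, which provides the same `patch` as the `mock` package.

```python
import sys
import types
from unittest.mock import patch


def make_module(name, source):
    """Create an in-memory module from source text and register it."""
    module = types.ModuleType(name)
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


# Hypothetical stand-in for utils.py
make_module("demo_utils", "def word_length(name):\n    return len(name)\n")

# Stand-in for user.py written with `from module import function`
user_from = make_module(
    "demo_user",
    "from demo_utils import word_length\n"
    "def calculate(name):\n"
    "    return word_length(name)\n",
)

# Stand-in for user.py written with a plain module import
user_mod = make_module(
    "demo_user_mod",
    "import demo_utils\n"
    "def calculate(name):\n"
    "    return demo_utils.word_length(name)\n",
)


def calculate_with_patch(target, module):
    """Patch `target` to return 10, then call module.calculate('hello')."""
    with patch(target) as mock_length:
        mock_length.return_value = 10
        return module.calculate("hello")


# Patching where the function is defined misses the name bound in demo_user.
missed = calculate_with_patch("demo_utils.word_length", user_from)
# Patching where the function is used replaces the name calculate() reads.
hit = calculate_with_patch("demo_user.word_length", user_from)
# With a plain module import, patching the defining module works.
hit_via_module = calculate_with_patch("demo_utils.word_length", user_mod)

print(missed, hit, hit_via_module)
```

Only the first call still runs the real `word_length`; the other two get the mocked value.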

Python Pitfalls

I was woken up today with the following question:
[python]
def foo(x=[]):
    x.append(1)
    return x

>>> foo()
>>> foo()
[/python]

What could be the output? The answer is

[1]
[1, 1]

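I was stupefied for a minute before I started DuckDuckGo-ing Python default arguments, Python garbage collection, Python pitfalls, etc.

The short version: the default `[]` is created once, when the `def` statement runs, and that same list object is reused on every call that omits `x`. These links helped me understand how Python handles mutable default arguments: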
Deadly Bloody Serious – Default Argument Blunders
Udacity Wiki – Common Python Pitfalls
Digi Wiki – Python Garbage Collection
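
A short sketch of what is going on, and of the usual way around it. `foo` is the function from the question; `foo_fresh` is a name made up for this illustration, using `None` as a sentinel so that each call builds its own list.

```python
def foo(x=[]):
    # The default list is created once, when `def` runs.
    x.append(1)
    return x


def foo_fresh(x=None):
    # Use None as a sentinel and build a new list on each call.
    if x is None:
        x = []
    x.append(1)
    return x


first = foo()
second = foo()
print(second)                         # [1, 1]
print(first is second)                # True: both calls returned the same list
print(foo.__defaults__[0] is first)   # True: it is the stored default itself
print(foo_fresh(), foo_fresh())       # [1] [1]
```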

Travis CI config for Python + NodeJS

I currently work with the Gluu Federation – an open source company. My present job is to build a Web UI for the cluster management using the available API.

We use Python Flask for the Backend and AngularJS+Bootstrap for the Frontend. Pretty much a standard affair these days. Testing is crucial and we have the backend tests in Python Nose and the Frontend Unit tests using Karma and Jasmine running on NodeJS.

Now the question is how one configures the Travis CI .yml file to run tests in both Python and JavaScript. I initially set it up using language: node_js and ran only the frontend tests. Then I went around searching and stumbled on travis-ci – issue#4090 – Support multiple languages. Thanks to the tip from @BanzaiMan, I now have both frontend and backend tests running with a simple yml file.

The Social Media Dilemma

I woke up this morning and saw this piece on High Frequency Trading. I have already read Michael Lewis’ Flash Boys and The Big Short, which kindled a strong dislike of HFT, though not before I had dreamt of making millions from it. When I hit this

We tend to understand the concept of actively using technology to achieve certain ends (exercising agency), but we find it harder to conceptualise the potential loss of agency that technology can bring. It’s a phenomenon perhaps best demonstrated with email: I can use email to exercise my agency in this world, to send messages that make things happen. At the same time, it’s not like I truly have the option to not use email. In fact, if I did not have an email account, I would be severely disabled. There is a contradiction at play: The email empowers me, whilst simultaneously threatening me with disempowerment if I refuse to use it.

my mind automatically went to one point of long-time consternation – Facebook. With the recent reports of both Google and Facebook pushing for face recognition technologies, and with even people at MIT Technology Review wondering what it means and how people will take it, I am more spooked about the issue than I usually am.

The Dilemma

I am trying to set up something which requires inputs from a number of people, intellectually, monetarily and personally. And the one place where all these people could easily be reached, coordinated, and followed up with is Facebook. There is no denying it. I saw a lot of groups coordinate a lot of stuff over there before I deleted my account. It has become so ubiquitous in the everyday lives of millions of people that people like me will be looked down upon as the ‘anti-vaxxers’ of digital connectivity. I am also finding it increasingly hard to explain why I don’t have an account, more so when it comes to why I deleted it.

For most people the benefit of being ‘connected’ to friends and family far outweighs the perceived threat of someone in some distant place owning our identity as one among the millions.

The email empowers me, whilst simultaneously threatening me with disempowerment if I refuse to use it.

This sounds so much like

The Facebook empowers me, whilst simultaneously threatening me with disempowerment if I refuse to use it.

The compromise of having to let Facebook’s scripts stalk me, monitor me, and feed me what it thinks is good for me, just so that I can set up and run the things I want to, has resulted in a dilemma like no other.

Probable Ways Out

  • The old way: Refrain from going to FB and do things the old way. That is, call the first person you know, get to know about the next person, and from him the next one, and so on. While it is entirely doable, it does involve retelling the same story a number of times.
  • Be the hypocrite: Let somebody else do the co-ordination on platforms like Facebook, as celebrities do. I don’t think I am wealthy or famous enough to hire a SoMe firm. It also involves being a hypocrite for using Facebook while calling it a bad thing.
  • One among the millions: Throw out all reservations and jump into FB. Let whatever hits the millions hit me.

All three are equally straining for a variety of reasons. And it is killing me.

 

Thattachu – Open Source Typing Tutor

The typing tutor is a well-known, ancient domain to work on. There are a number of places, online and offline, tangible and intangible, to learn typing. But Srikanth (@logic) stumbled on a peculiar problem when he worked for the Wikimedia Language Engineering team. The new-age Indic input methods used on computers seem to have no place where one can learn to type with them. The only way seems to be to keep a visual reference for the layout and begin typing one key at a time. This might be the most inefficient method of learning to input information. So what do we do?

Enter Thattachu

Thattachu is an open source typing tutor. It is built using jQuery IME, a tool developed by the Wikimedia Language Engineering Team. jquery.ime currently supports 62 languages and 150+ input methods. It is a JavaScript library which can be used on any web page. So we (Srikanth and I) set out to build a generic typing tutor which could employ any of the 62 languages or 150+ input methods. The project was conceived in May 2014, but work on it began only in May 2015, as I was busy with my Teach For India Fellowship. Thattachu borrows its tutor style from GNU Typist (gTypist), which I used to learn touch typing in English.

Interface

Thattachu has three pages:

  1. Home page – A welcome page for those visiting the site, explaining what it is about.
  2. Course Selector – A place where you choose the course to learn. You select the language and the input method you want to learn and it lists the available courses.
  3. Workbench – A place where you practice typing. When you select a course in the Course Selector, the workbench loads with the course you selected and you can begin typing with the input method you chose. It remembers your most recent course and lesson so you can continue from where you left off in the previous session.

Course Structure

Each language has a set of input methods, and each input method has a set of courses. The courses are classified by difficulty as “Beginner”, “Intermediate” and “Expert”. Each course has a set of lessons to complete, and each lesson is a collection of lines that have to be typed.
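
The hierarchy above can be pictured with a small data model. This is only a sketch of the shape: the class names, the field names and the `example-input-method` id are assumptions made for illustration, not the actual schema of Thattachu’s course files.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

LEVELS = ("Beginner", "Intermediate", "Expert")


@dataclass
class Lesson:
    title: str                      # hypothetical field name
    lines: List[str] = field(default_factory=list)


@dataclass
class Course:
    language: str                   # language code, e.g. "ta"
    input_method: str               # placeholder id, not a real jquery.ime name
    level: str
    lessons: List[Lesson] = field(default_factory=list)

    def __post_init__(self):
        # Reject difficulty levels outside the three named above.
        if self.level not in LEVELS:
            raise ValueError(f"level must be one of {LEVELS}")


course = Course(
    language="ta",
    input_method="example-input-method",
    level="Beginner",
    lessons=[Lesson(title="Lesson 1", lines=["first line", "second line"])],
)

# Serialise the nested structure to JSON, keeping non-ASCII text readable.
course_json = json.dumps(asdict(course), ensure_ascii=False, indent=2)
print(course_json)
```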

Thattachu Asiriyar

Creating the tool is the easier part of a content-dependent system. The real work is generating the content that the tool can be used with. So we faced the challenge of creating the course JSON files required for the tool. Hence a user-friendly tool, Thattachu Asiriyar, was born.

Thattachu Asiriyar lets anyone author a course and generate a course file. If you want to author courses, go to Thattachu Asiriyar, create the course file and mail it to
arun [at] arunmozhi [dot] in, mentioning “Thattachu course” in the subject.

Github savvy authors

Or, if you have a Github account and know about pull requests, kindly:

  1. Fork the Thattachu repo
  2. Put the course file into the data/language_code folder
  3. Update the courselist.json in your folder with the metadata and the filename
  4. Send me a pull request.
  5. Feel awesome for helping humanity learn typing

Developers

Here are a few points for those interested in the code or those who think they can improve Thattachu.

  • Thattachu is a web application written in HTML and JavaScript (AngularJS).
  • It is a completely static site and can be hosted on any web server. All the information is stored as JSON files, which Angular’s $http fetches via XHR when needed.
  • For input jQuery.ime is used.
  • It uses the browser’s localStorage to remember the last course worked on and load it when the user next opens the page.

Zimbalaka – Zim file creator for Offline Wikipedia

OpenZim is a Wikimedia developed format for offline reading of Wikipedia. Read more here. But the project was sadly sidelined and the support from MediaWiki, the software that runs Wikipedia sites, was also removed.

I came to know about all this from Bala Jeyaraman of Vasippu. He is planning to introduce tablets in a classroom of 6th standard students, with exceptional comprehension levels compared to average Indian classrooms, and wanted a way to load select material onto the tablets. The OpenZim files have an excellent reading app called Kiwix, which also offers complete Wiki sites as downloads. But tablets can’t afford to hold a huge amount of data, like the full Wikipedia, and there is no way to create a zim file with select topics yourself. One has to request the OpenZim team to do it.

Enter Zimbalaka

Zimbalaka is a project which tries to solve just that. It creates offline Wikipedia content files in the zim file format. A person can input a list of pages to be included in the zim, or a Wikipedia category. Zimbalaka then downloads those pages, removes all the clutter like the sidebar, toolbox, edit links, etc., and gives a cleaned version as a zim file for download. It can be opened in Kiwix.

The zim is created with a simple welcome page with all the pages as a list of links. The openzim format also has an inbuilt search index and Kiwix uses this really well. So you can create zims of 100 articles and still navigate to them easily either way.

Zimbalaka has multi-lingual and multi-site support. That is, you can create a zim file from pages of any language of the 280+ existing Wikipedias, and also from sites like Wikibooks, Wiktionary, Wikiversity and such. You can even input any custom URL like (http://sub.domain.com/); Zimbalaka will add (/wiki/Page_title) to it and download the pages.
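
The URL rule described above can be sketched in a few lines. This is not Zimbalaka’s actual code: `page_url` is a hypothetical helper, and the only assumptions are the ones stated in the text (the page lives under /wiki/) plus the MediaWiki convention of writing spaces in titles as underscores.

```python
from urllib.parse import quote


def page_url(base_url, title):
    """Join a wiki base URL and a page title into a /wiki/ page URL."""
    base = base_url.rstrip("/")                     # avoid a double slash
    slug = quote(title.strip().replace(" ", "_"))   # spaces become underscores
    return f"{base}/wiki/{slug}"


print(page_url("http://sub.domain.com/", "Page title"))
# http://sub.domain.com/wiki/Page_title
```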

It is currently hosted by my good friend Srikanth (@logic) at http://srik.me/zimbalaka

Screenshots

Here is how the content looks in Kiwix for Android.

[Screenshots: navigate, multi]

Pain points

  • A small pain point is that Zimbalaka also strips the external references that occur at the end of the Wikipedia articles, as I didn’t find it useful in an offline setup.
  • You cannot add a custom Welcome page in the zim file. Not a very big priority. The current file does its work of listing all the pages.
  • You cannot include pages from multiple sites in a single zim file. The workaround is to create multiple files or use a tool called zimwriterfs, which has to be compiled from source (this is what Zimbalaka uses behind the scenes).

Developers

This tool is written using Flask, a simple Python web framework, for the backend and Bootstrap for the frontend, and uses the compiled zimwriterfs binary as the workhorse. The zimming tasks are run by Celery, which is managed by supervisord. All the coordination and message passing happen via Redis.

Do you want to peek at how it is all done? Here is the source code [https://github.com/tecoholic/Zimbalaka]. Feel free to fork, modify and host your own instance.

Update

The OpenZim team has appreciated the effort I had put in and offered to host the tool on their server at http://zimbalaka.openzim.org. They have also pointed me to the desired backend called ‘mwoffliner’ that they have developed to download and clean the HTML. I will be working on it in my free time.