Data Science – First Impressions

After some thought on what to learn next, I enrolled in the IBM Data Science Professional Certificate program. It is a beginner-level program that provides the necessary foundations for Data Science. I have completed 3 courses so far:

  1. What is Data Science?
  2. Tools for Data Science
  3. Data Science Methodology

I liked all the courses so far. Despite the big presence of IBM Cloud tools, the courses cover the concepts in a fairly generic way.

This post is NOT a review of the course or certification.

This post is mostly about my first impressions of the domain of Data Science.

1. It’s not strictly a science

Data Science doesn’t have a clear definition. Everyone defines it the way they want. Murtaza Haider, author of Getting Started with Data Science, puts it simply as

Data Science is what data scientists do

It is not statistics either. So I guess, instead of inventing a new word for experimental statistics, data analysis, visualisation and model building, someone called it a science.

2. Engineers will probably hate it

If there is one underlying principle that defines engineers, it is their love for certainty. Engineers strive to build systems governed by a set of rules that produce predictable outcomes. Data Science is quite the other way around: you start with a question and move towards an answer, which could be anything, valid or invalid. It is a frustrating experience to go the whole way without knowing where you will end up. Sure, we get some hints here and there throughout the process. If one has the “journey is more important than the destination” attitude, it is a good ride. But if you yearn for certainty or get frustrated midway through the process, you will probably end up torturing the data until it confesses the way you want it to.

3. The Jargon

It is a field of jargon. Everything has a specific word or phrase. If you open the data in Excel and look at it, you will probably call it inspecting the data. As a data scientist, you do the same with a couple of lines of Python code in a Jupyter Notebook and call it “Exploratory Data Analysis” or EDA. If you change values or format them to a specific standard, it is “Data Manipulation” or DML (did that thing even need a three-letter acronym?). If you query some data and do some manipulations, it is an Extract-Transform-Load or ETL pipeline.
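To make the contrast concrete, here is a minimal sketch of what that “couple of lines” might look like in pandas; the CSV file name is made up for illustration:

import pandas as pd

# Load the data (hypothetical file name)
df = pd.read_csv("survey_results.csv")

# "Inspecting the data", data-science style
print(df.head())          # peek at the first five rows
print(df.describe())      # summary statistics for the numeric columns
print(df.isnull().sum())  # count of missing values per column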

While some of them are valid terminology (like ETL) that exists to communicate effectively, some of them are just marketing jargon that exists to make stuff seem bigger than it is.

4. The Ah..ha moments

This is the best thing about Data Science, according to me. It is those moments when your perception is altered by the data, or when the output is altered by your perception. Engineering by its nature is the application of established principles; we get our ah..ha moments when we learn a new concept invented by a scientist. But in data science, the data itself can produce such moments, the work being exploratory in nature. Since we start with a question and follow the data to the answer, it usually results in a light bulb moment.

5. Practicality

This is the area which is most conflicting for me. Things like “data is the new oil” have been said for close to a decade now, and the general consensus seems to be the more data the better. But both my personal experience as a teacher and the case study about a health insurance provider in the course have made me wary of the practicality of this approach. I could write about my personal experience as anecdotal evidence, but I think I would rather give a more data-oriented reason (the post being about data science and all):

Explosion of healthcare administrators in the US

While studying the data science methodology, it becomes acutely clear how this happened. I am not saying data science is the root cause of the problem, but it is definitely a contributing factor, as no model built by any data scientist can be static. It is an iterative process which keeps the cycle of data collection, modelling and output going. So when overdone, it feels more like a problem than a solution.

6. The machines are coming for the data scientists

The hype for data scientists went mainstream, I think, with the McKinsey report projecting a shortage of 140,000 to 190,000 data professionals in the US alone by 2018, followed by data scientist becoming the highest-paid technical job. Going by the IBM tools that I have used during the course, I don’t think the hype will age well.

Money is in automation – an estimated 70 to 90% of a data scientist’s time is spent collecting and preparing the data for analysis. If there is anything computers were invented for, it is automating such mundane processes. Even if a company saves only 50% of that time, it is a great cost saving. So automation tools will cut down the work, and thus the demand for data scientists.

Side Note: You can create a free account at cloud.ibm.com and check out their tools to get an idea of the direction and sophistication of the tools being developed.

Uncertainty of the outcome – as mentioned earlier, not all organisations will gain from having a data science team. Apart from the data, a variety of things impact the outcome of the exercise: the organisation’s size, the scope for data-driven decision making, and the talent of the data science team. Combined with off-the-shelf offerings from IT companies, the data scientist’s role might shrink to just a computer operator in some cases.

No, I am not saying data scientists will become obsolete. Just that it is not going to live up to the hype.

Before you throw brickbats

I have completed only 3 of the 9 courses in the program. Once I complete another 3, I think I will have a better understanding of things and will revisit the topic.

text/plain MIME Type and Python

When you do echo "x" > my_file and then check its MIME type using file --mime-type my_file, it says text/plain. But when you do the same in Python with

with open("my_file_2", "w") as fp:
fp.write("x")

and then check the MIME type, it says application/octet-stream. What’s the difference?

For the impatient

echo appends a newline to the file, and that newline tells the file utility it is a text file.

For the curious

When I saw this question on StackOverflow, I was really stumped, for the following reasons:

  1. I didn’t know the file utility could be used to get the MIME type of a file. I thought MIME types were only relevant in the context of web servers and clients. After all, MIME stands for Multipurpose Internet Mail Extensions.
  2. I thought operating systems usually use the file extension to decide the file type, and by extension the MIME type. Don’t the OSes warn us every time we touch the extension part of a file name while renaming? So how does the file utility do this on files without any extension?

Adding extensions

Let’s try adding extensions:

$ echo "x" > some_file.txt
$ file --mime-type some_file.txt
some_file.txt: text/plain

Okay, that’s all good. Now to the Python side:

with open("some_file_2.txt", "w") as fp:
fp.write("x")
$ file --mime-type some_file_2.txt
some_file_2.txt: application/octet-stream

What? file doesn’t recognise file extensions?

The OS conspiracy theory

Maybe echo writes the MIME type as metadata onto the disk, because echo is a system utility that knows how to do that, and in Python the user (me) doesn’t know how? Clearly the operating system utilities are a cabal with some forbidden knowledge. And I am going to uncover that today, starting with the file utility, which seems to give different answers to different programs.

How does ‘file’ determine MIME Type?

The answers to this question have some useful information:

How do you change the MIME type of a file from the terminal?

  1. MIME type is a fictional value. There is no inherent metadata field that stores the MIME type of a file.
  2. Each operating system uses a different technique to decide a file’s type. Windows uses the file extension, Mac OS uses type and creator codes, and Unix uses magic numbers.
  3. The file command guesses a file’s type by reading the content and looking for magic numbers and strings.
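Python’s standard library makes the second point easy to see: the mimetypes module guesses purely from the file extension and never looks at the content (a quick illustration, not how file works):

>>> import mimetypes
>>> mimetypes.guess_type("some_file.txt")
('text/plain', None)
>>> mimetypes.guess_type("some_file")  # no extension, no guess
(None, None)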

Time to reveal the magic

Let us peer into the souls of these files in their purest form, where there is no magic, only 1s and 0s. So I printed the binary representation of the two files.

$ xxd -b my_file
00000000: 01111000 00001010 x.

$ xxd -b my_file_2
00000000: 01111000 x

The file generated by echo has two bytes (notice the . after the x), whereas the file I created with Python has only one. What is that second byte?

>>> number = int('00001010', 2)
>>> chr(number)
'\n'

And it turns out that, like in every movie about magic, there is no such thing as magic. Just clever people appending newlines to tell file it is a text file.
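So the fix on the Python side is simply to write that newline ourselves; file should then report text/plain, just as it does for the file created by echo (my_file_3 is just a fresh name for this check):

with open("my_file_3", "w") as fp:
    fp.write("x\n")  # the trailing newline is the "magic"

$ file --mime-type my_file_3
my_file_3: text/plain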

Creating a trick

Now that the trick is revealed, let’s create our own magic trick.

$ echo "<file></file>" > xml_file
$ file --mime-type xml_file
xml_file: text/plain

$ echo '<?xml version="1.0"?><file></file>' > xml_file
$ file --mime-type xml_file
xml_file: text/xml

Useful Links

  1. https://www.baeldung.com/linux/file-mime-types
  2. https://unix.stackexchange.com/questions/185216/file-command-apparently-returning-wrong-mime-type
  3. https://stackoverflow.com/questions/29017725/how-do-you-change-the-mime-type-of-a-file-from-the-terminal

Building a quick and dirty data collection app with React, Google Sheets and AWS S3

Covid-19 has created a number of challenges for society that people are trying to solve with the tools they have. One such challenge was to create an app for collecting data from volunteers about the food supply requirements of their communities.

This needed a form with the following inputs:

  1. Some text inputs like the volunteer’s name, vehicle number, address of delivery, etc.
  2. The location in geographic coordinates, so that the delivery person can launch Google Maps and drive to the place
  3. A couple of photos of the closest landmark and the building of delivery

Multiple ready-made solutions like Google Forms and Zoho Forms were attempted, but we hit a block when it came to having a map widget that would let the location be picked manually, and to uploading photos. After an insightful experience with CovidCrowd, we were no longer in the mood to build a CRUD app with a database, servers, etc., so the hunt for a low-to-zero-maintenance solution resulted in a collection of pieces that work together like an app.

Piece 1: Google Script Triggers

Someone has successfully converted a Google Sheet into a writable database (sort of) with some Google Script magic. This allows any form to be submitted to the Google Sheet, with the information stored in columns like in a database. This solved two issues: no need for a database, and no need for a back-end interface to access the data.
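For illustration, here is a rough sketch of what a submission to such a script endpoint might look like; the URL and field names are hypothetical, and in our app the request is made from the React form rather than Python:

import requests

# Hypothetical Apps Script web app URL (deployed from the sheet's script)
SCRIPT_URL = "https://script.google.com/macros/s/XXXX/exec"

# Hypothetical form fields matching the sheet's columns
payload = {
    "volunteer_name": "A. Volunteer",
    "vehicle_number": "TN-01-AB-1234",
    "location": "https://maps.google.com/?q=13.0827,80.2707",
}

# Each POST ends up as one new row in the Google Sheet
response = requests.post(SCRIPT_URL, data=payload)
print(response.status_code)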

Piece 2: AWS S3 Uploads from Browser

The AWS JavaScript SDK allows direct upload of files into buckets from the browser using Cognito credentials and a Pool ID. Now we can upload the images to the S3 bucket and send the URLs of the images to the Google Sheet.

Piece 3: HTML 5 Geolocation API and Leaflet JS

Almost 100% of this data collection is going to happen via a mobile phone, so we have a high chance of getting the location directly from the browser using the browser’s native Geolocation API. In scenarios where the device location is not available or the user has denied location access, a LeafletJS widget embedded in the form provides a marker which the user can move to the right location manually. This too is sent to the Google Sheet, as a Google Maps URL with the lat-long.

Piece 4: Tying it all together – React

All of this was tied together into a React app using React Hook Form, with data validation and custom logic orchestrating the location and the file uploads. When the app is built, it results in an index.html file and a bunch of static CSS and JS files, which can be hosted for free on GitHub Pages or in an existing server as a subdirectory. The files could even be served gzipped over a CDN, because there is nothing to be done on the server side.

We even added an image preview to the form, so the user can see the photos they are uploading.

[Image: resource_form]

Architecture Diagram

[Image: resource_form_architecture]

Caveats

  1. Google Script trigger limits – There is a limit to how many times the Google Script can be triggered.
  2. AWS Pool ID exposed – The Pool ID with write capabilities is exposed to the world. Someone smart enough could turn your S3 bucket into their free storage, and if you have enabled DELETE access, you could lose your data as well.
  3. DDoS and spam – There are also other considerations, like spamming the form until the Google Script trigger limits are exhausted, or DDoSing the Google Script URL with random requests.

All of these are overlooked for now, as the volunteers involved are trusted and the URL is not publicly shared. Moreover, the entire lifetime of this app might be just a couple of weeks. For now, this zero-maintenance architecture allows us to collect the custom data we want.

Conclusion

Building this solution showed me how problems can be solved without writing a CRUD app with an admin dashboard every time. Sometimes a Google Sheet might be all we need.

Source Code: https://github.com/tecoholic/ResourceForm

PS: Did you know Covid19India.org is just a single Google Sheet and a collection of static files on GitHub Pages? It serves 150,000 to 300,000 visitors at any given time.

Lottie – Amazing Animations for the Web

[Lottie animation: 15549-no-wifi]

Modern websites come with some amazing animations. I remember Sentry.io used to have an animation that showed packets of information going through a system and getting processed in a processor, etc. If you browse Dribbble, you will see a number of landing page animations that just blow the mind. The most mainstream brand that employs animations is Apple; their web page was a playground when they launched Apple Arcade.

Sidenote: Sadly, all these animations vanish once the pages are updated. It would be cool if they could be saved in some gallery where we could view them at later points in time.

We were left wondering: how do they do it?

[Image: animation_discussion]

I might have found the answer: Lottie.

What is Lottie? The website says

A Lottie is a JSON-based animation file format that enables designers to ship animations on any platform as easily as shipping static assets. They are small files that work on any device and can scale up or down without pixelation.

Go to their official page here to learn more. It is quite interesting.

Take a peek at the gallery as well; there are some interesting animations that can be downloaded and used on websites for free.

Moving back from Mac to Windows + Linux

Content Warning: Rant ahead

As my MacBook Air was becoming more and more restrictive in what I could do, due to its low 4 GB memory and 128 GB SSD, I decided to buy a new laptop with better specifications. After some filtering and comparison on Flipkart and Amazon, I finally settled on a Lenovo S540 14″ with 8 GB RAM and 1 TB SSD. It also came fitted with a 2 GB graphics card, which I think will make working with ML algorithms easier. While the hardware is great for my requirements, the software is a complete letdown.

Issue 1: Windows Font Rendering is Crap

The screen is a full HD 1920×1080 display at 14 inches. One would think it would easily match the display of my MacBook Air (1440×900), but nope. Not a chance.

The system recommends a scaling of 150% for good results; at anything below that, the system font Calibri starts breaking down and there seems to be no anti-aliasing effect.

There are a couple of solutions to this problem, like setting the scaling to 100% and increasing the font size separately. This works to a certain degree, but it doesn’t achieve the smoothness of the 150% scale.

Now I have an interface that seems to be adjusted for my grandma’s failing eyesight.

Issue 2: Microsoft loves Linux – My Feet

I think the whole “MS loves Linux” nonsense started around the same time I bought the MacBook, so I never experienced what it meant. I get it now: they wanted to sell Linux machines on their Azure cloud, and that’s about it. Whatever contributions they have made must have centred around that goal, because installing Linux on a Windows 10 machine is more difficult now than it was 5-8 years ago. Back then, it was just a matter of knowing how to partition disks and being able to choose the boot disk. Now I had to:

  • Create the bootable disk in a specific format for UEFI compatibility
  • Run a command to change the storage access method from RST to AHCI
  • Go into the BIOS, disable Secure Boot, and change to AHCI
  • Boot into Safe Mode so that the disk works with the changed storage mode
  • Finally boot into the install disk and install.

What should have taken me 15-30 minutes took me 2 and a half hours.

Issue 3: Windows 10 is a Data Collection Pipeline

I am really horrified at the number of toggles I had to turn off during the setup process, and I still keep finding more as I use the system.

Issue 4: Application Management in Windows

The Windows Store is a disaster. I don’t know what is installed on my system and what isn’t. There are tiles for games that aren’t installed, and there is no way to differentiate between the tile of an installed application and the tile of a shortcut to an application that is recommended for install.

Issue 5: Why are tiles in Start Menu?

With 150% scaling, it always feels like I am seeing only a part of the actual screen when the tiles come up. I don’t understand how MSFT concluded that they should go back to the start menu but decided to keep the tiles nonetheless. Either tile or don’t; consistency, please. The mashup is a nuisance, and everybody just has to learn to live with it.

Issue 6: Application Management in Ubuntu

So everybody has been bitten by the centralised application distribution model. But tell me, which serious software actually gets published there? At least none of the ones I use, even in the Mac OS ecosystem, which started the store concept. MS Office, Adobe Creative Suite, IDEs like PyCharm, Android Studio and Eclipse, browsers… everything is a package downloaded from the vendor’s site. But that hasn’t stopped Canonical from creating the Snap store. Now I seriously don’t know why there is a Software Centre, and also a Snap Store, and the good old apt package manager.

The Good Bits in Linux

It’s been 24 hours of hell with the new system. Yet, not everything is bad.

  • Once up and running, I haven’t encountered Wi-Fi or Bluetooth driver issues.
  • The kernel seems to be pretty stable.
  • Grub has themes, and OS selection is stylish.
  • Memory usage is pretty low.
  • Font rendering and anti-aliasing are spot on. I think I just need some time to get used to going from a 16:10 to a 16:9 aspect ratio.
  • The drivers for the graphics card are in place.
  • Tap-to-click and natural scrolling keep my UX the same across both my machines.

Conclusion

After a frustrating 24 hours of setting up the system, I have completely given up on Windows. As usual, Linux will be my primary OS. I will turn to Windows for recording tutorial videos, when collaboration requires MS Office, or maybe for games. If money weren’t an issue, I don’t think I would have moved from Mac to PC at all. Things like three-finger application switching and desktop switching are still etched in me. So, personally, I prefer:

  1. MacOS
  2. Ubuntu
  3. Windows… I would try my best not to boot this thing.

OSM Mapping with AI from Facebook

Facebook and HOT OSM have come together to use machine learning to make maps better. It looks like a mutually beneficial collaboration that will help OSM.

“AI is supercharging the creation of maps around the world” is a very interesting development in OSM mapping. It will perhaps reduce the number of hours required to map areas by at least a couple of orders of magnitude.

Automatic feature extraction is nothing new, of course; people have been doing it for decades. I should know: I spent two years (supposedly) doing a master’s in Remote Sensing, the domain that deals with capturing information about the earth using satellites and extracting useful information from the captured data. The traditional remote sensing workflow uses techniques like supervised and unsupervised classification for feature extraction. The beauty of machine learning algorithms is that models can be built which merge these two classification techniques and automate the process.

The OSM community, AFAIK, stayed away from automated data generation, mostly because the work involved in cleaning up bad data is more difficult and tedious than creating new data. With the collaboration of HOT OSM and the development of RapiD (an enhanced iD editor), it looks like the process put in place ensures that only quality data goes into the system, preventing the issues of bad data from happening.

While it is all colourful for the OSM community, the question with profit seekers like Facebook is always: what’s the catch? I tend to think there is no catch for the OSM community, and that FB has decided to invest some money in OSM so that it can use the fruits of the community’s work for its own benefit. It looks like a mutually beneficial arrangement: the community gets the resources from FB to create a better dataset, and FB gets the data for its usage. I am sure FB will extract more value out of its investment in the long run and reap more benefits than the community. They will obviously have their own data layers on top of the OSM data; they will overlay all sorts of tracking data and enhance their capabilities as a surveillance machine. But all that is left to them and their corporate interests.

For now this is a good thing and doesn’t seem to do any harm to OSM.

The Social Media Dilemma

I woke up this morning and saw this piece on High Frequency Trading. I have already read Michael Lewis’ Flash Boys and The Big Short, which kindled a strong dislike of HFT (though not before I dreamt of making millions doing it). Then I hit this:

We tend to understand the concept of actively using technology to achieve certain ends (exercising agency), but we find it harder to conceptualise the potential loss of agency that technology can bring. It’s a phenomenon perhaps best demonstrated with email: I can use email to exercise my agency in this world, to send messages that make things happen. At the same time, it’s not like I truly have the option to not use email. In fact, if I did not have an email account, I would be severely disabled. There is a contradiction at play: The email empowers me, whilst simultaneously threatening me with disempowerment if I refuse to use it.

my mind automatically went to one point of long-time consternation: Facebook. With the recent reports of both Google and Facebook pushing for face recognition technologies, and even people at MIT Technology Review wondering what it means and how people will take it, I am more spooked about the issue than I usually am.

The Dilemma

I am trying to set up something which requires inputs from a number of people: intellectual, monetary and personal. And the one place where all these people can easily be reached, coordinated and followed up with is Facebook. There is no denying it. I saw a lot of groups coordinate a lot of stuff over there before I deleted my account. It has become so ubiquitous in the everyday lives of millions of people that people like me will be looked down on as the ‘anti-vaxxers’ of digital connectivity. I am also finding it increasingly hard to explain why I don’t have an account, more so when it comes to why I deleted it.

For most people, the benefit of being ‘connected’ to friends and family far outweighs the perceived threat of someone in some distant place owning their identity as one among millions.

The email empowers me, whilst simultaneously threatening me with disempowerment if I refuse to use it.

This sounds so much like

The Facebook empowers me, whilst simultaneously threatening me with disempowerment if I refuse to use it.

The compromise of having to let Facebook’s scripts stalk me, monitor me, and feed me what it thinks is good for me, in order to set up and run the things I want to, has resulted in a dilemma like no other.

Probable Ways Out

  • The old way: Refrain from going to FB and do things the old way. That is, call the first person you know, get to know about the next person from them, and so on and so forth. While it is entirely doable, it does involve retelling the same story multiple times.
  • Be the hypocrite: Let somebody else do the coordination on platforms like Facebook, like celebrities do. I don’t think I am wealthy or famous enough to hire a social media firm. It also involves being a hypocrite, using Facebook while calling it a bad thing.
  • One among the millions: Throw out all reservations and jump into FB. Let whatever hits the millions hit me too.

All three are equally straining, for a variety of reasons. And it is killing me.


People’s Mobile

Caution: This is a bit wild, man.

Here is my idea.

Open a part of the spectrum used for mobile communication for public non-commercial use

Really!? What to do with it?

My plan is to run a community/volunteer/enthusiast/philanthropist-sponsored mobile network which is free for everybody to use. That is, if you could bear the initial installation charges. If we could put enough towers in enough places, we could all talk to each other for free, send messages for free, and access the internet for free throughout our lives. We could put an end to all this noise that projects messaging apps as technology disruption, engage in more serious pursuits than shouting on the road for net neutrality, and do away with recharge coupons and payment gateways like PayTM, FreeCharge, etc. Oh my, we could actually do away with a lot of unnecessary stuff. Then we would have an open internet with all the bandwidth that the technology could offer. Say “bye bye, data plans”.

Wonderful!

Exactly. Isn’t this the best thing you have heard in a while?

Yay, I am the man from a utopian future.

Sadly this won’t happen – Money.

Net-neutrality. This is why.

This post is a reply to my previous post, titled Why Net-Neutrality? Or Why not tiered pricing?.

After an hour-long discussion and debate with a couple of people, these are the things that tell me why net neutrality is essentially a social issue and not a capitalist one. I am going to lay out the arguments I made in the previous article and try to point out their flaws, to get a better understanding.

1. Greed of companies like Netflix and YouTube

The flaw with the argument “since these services use the ISP’s pipes for their service, they owe a part of their income to the ISP” is that it is akin to arguing that a runner who ran along this road to win a marathon owes a part of the prize to the local government.

The point that it is streaming that clogs the pipe is countered by the argument that it is precisely because of those services that the size of the pipe keeps getting bigger and the definition of broadband internet keeps getting revised to higher speeds.

2. The cost of usage

I argued that the one difference between an electric company and an ISP is the cost of usage. While an electric appliance doesn’t mean recurring income for its manufacturer, a web service earns an income with use. Thus there is a cost involved in the usage of internet services, and the ISPs want a slice of it. In short, the ISPs are greedy.

But the initial costs for an internet connection and an electric connection are very similar. Replace the electric line with a coaxial or CAT line, the local transformers with switches and repeaters, and the high-tension substation transformers with the servers of the ISP, and we have almost exactly the same setup, except that only electricity is a utility and the internet is not.

3. More speed more money

I asked why any ISP would want to create a slow lane if they make more money for more speed / more volume. Simply put, why would there be a fast lane and a slow lane?

It seems that it is exactly for that same reason: “more speed/volume, more money”. In a situation where the ISP can make more money, why would they be willing to give out a connection involving the same initial costs for a low-volume, low-income customer? The flaw here is assuming that volume/speed is limited and that the ISPs are really worried about their precious resource getting sucked up by the streamers. I should have known better.

Finally

I previously concluded,

I, as a layman consumer, am completely OK with the current state of affairs and don’t mind if the billing becomes usage-based, or someone creates a fast lane for those who pay more, as long as the slow lane is at the mandated minimum speed, which is the present case anyway.

Though I still say the same, I would now say it in fewer words: I want a neutral net, which is the present case anyway 🙂

Why Net-Neutrality? Or Why not tiered pricing?

Update: Kindly read the follow up article Net-neutrality. This is why.

I have a genuine question: why net neutrality? I know there are reams of pages answering this, but I am asking the completely layman, non-techie version of the question. Why? The only answer that I seem to get is: it protects the interests of the consumer.

Interest or Greed?

The current pricing model used by internet service providers is akin to that of an electric connection. They give you a line to use and bill you based on the connection speed and the quantity of data or the amount of time you spend online. The electric company doesn’t worry whether you use a geyser or a lamp as long as you pay the bill. Similarly, the present ISP model doesn’t care whether you stream YouTube or read Wikipedia. The analogy seems extremely apt, except for one difference: the cost of usage.

In the case of an electric connection, you buy whatever equipment you want to use and pay the cost of the hardware. The more you use it, the more the electric company gains; the equipment manufacturer gets nothing most of the time. Exceptions like cable TV exist, but cable is another utility altogether and we will get to that shortly. In the case of the ISP, though, you buy services over its line, and those services earn money. Here it is completely opposite: the ISP gains almost no extra money no matter how much of any kind of service you use over its lines.
For example, using a geyser on a daily basis could consume a lot of energy and prove to be great revenue for the electric company, whereas streaming movies 24×7 benefits only Netflix or YouTube and does nothing for the ISP except choke its lines.

The ISPs thus want to take a part of the money paid to the service, as the service also utilises their resources. The invention of things like the Fair Usage Policy (FUP) is a way to limit this choking of the lines.
So when companies like Google talk about net neutrality (against tiered pricing), the real question is whether the company is really trying to protect the consumer or just trying to keep the free ride it currently enjoys.

The Cable TV Pricing

From a consumer standpoint, tiered pricing is certainly against our interest, because we don’t really want to pay more for certain services. But it is worth revisiting the evolution of cable TV to see what the consumer has really said and done about it.

Initially there was only one way of getting satellite TV: using a dish antenna. The initial investment was huge. Then came the cable operators, who charged a certain amount of fees for a certain number of channels; no choices, everyone just paid an amount. After that came the Direct to Home (DTH) services, replacing the cable TV operator with a corporate body. To this day we pay per channel, in so-called packages. And the customer has hopped along merrily, with some noises here and there. Nothing in the transition has affected the channels’ (services’) revenue.

Applying the same route, we are currently in the cable operator stage, where the ISP gives us a cable and we pay an amount. If the model evolves into one which prices textual data, VOIP and media streaming at different rates, like the channel-based DTH pricing, and bills the consumer on what is consumed, a lot of people might end up paying a lot more, and a lot of people might pay less. Just as I avoid all sports channels, I might avoid all VOIP usage. If I am a programmer, most of my requirement is software packages and code, so there is no need for me to pay more. But people preferring YouTube to TV would pay a lot, which could reduce revenue for businesses like Apple TV, Google TV, Netflix, etc., by taking away a share of the pie. So to me as a consumer, paying based on how much traffic I use seems as legitimate as a big truck paying more than a car on a toll road, simply because it uses more of the resource.

This argument does seem biased towards ISPs. But what if things like the ones below happen?

From Wikipedia
Neutrality proponents claim that telecom companies seek to impose a tiered service model in order to control the pipeline and thereby remove competition, create artificial scarcity, and oblige subscribers to buy their otherwise uncompetitive services.

From HuffingtonPost
A fast lane would let some websites operate at higher speeds and essentially relegate many sites — likely smaller, less-moneyed ones — to a slower pace.

Where are the regulatory authorities who enforce mobile and telecommunication pricing? Why can’t such an authority be created, and clear rules drafted saying what can and cannot be charged for? Why can’t things like slow lanes be completely outlawed? Say we create a law which says non-profit sites like Wikipedia and Khan Academy should never be charged (extra) even for streaming content, and pricing can only be applied to for-profit entities like YouTube and Netflix. Calls for making the internet a utility are a step in this direction of regulation. And why would anyone really want to create a slow lane if more speed and more data transfer result in more money for the utility provider?

So…

The entire hue and cry over differentiated pricing seems to be for only one reason: greed. The greed of the companies to make money. They pull us, the consumers, into the issue by terrifying us that our bills will shoot sky high and the ISPs will fleece us. If that is the real issue, I think it is better dealt with by talking to (or even creating new) regulatory authorities, consumer protection agencies and other similar governmental organisations, and certainly not by tactfully converting a purely capitalistic issue into a social issue involving rights and freedom.

I, as a layman consumer, am completely OK with the current state of affairs and don’t mind if the billing becomes usage-based, or someone creates a fast lane for those who pay more, as long as the slow lane is at the mandated minimum speed, which is the present case anyway.