Matplotlib Tutorial: Plotting the size of the most popular iPhone applications over time

Strange things happen after you update some applications on your mobile phone. More often than not, the device runs out of space because some applications are greedier than others. That was the case with my iPhone after a regular update of the most popular applications: I received a message asking me to remove something because there was no space left on the device (sigh!).

So the initial question was: how does application size change over time? Two years ago there were no problems with the same application, but now it takes up twice the space!

One of the best ways to understand this kind of information is to plot the data and show the relation between two variables: application size and the date of the update release.

Getting data

Here is the problem: iTunes keeps no history of application releases, only the latest one. After some quick research, the Internet Archive looked like a good way to view the history of a specific web page over time. In our case that is the iTunes web page for a specific application: see the Facebook application history as an example.

Unfortunately, the full history is not saved for every web page. For example, the Gmail history starts only on March 25, 2017. Even so, we can still get an idea of how application size changes over a short period of time.

Getting the history for a web page is easy with the Internet Archive search endpoint: https://web.archive.org/cdx/search/cdx?. All you need to do is pass the web page URL to this endpoint; the result is a list of items, each with a timestamp and the URL of the page for that date.
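To make the request concrete, here is a minimal sketch of querying that endpoint from Python. It is not the parser from this post: the function names are made up for illustration, and it assumes the endpoint is asked for JSON output, where the first row of the response is a header naming the columns.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url):
    """Return the search URL that lists archived captures of page_url."""
    # Assumption: "output=json" asks for a JSON array of rows instead of plain text.
    params = {"url": page_url, "output": "json"}
    return CDX_ENDPOINT + "?" + urlencode(params)


def captures_from_rows(rows):
    """Turn a JSON response (header row plus data rows) into
    (timestamp, snapshot URL) pairs."""
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    ts_idx = header.index("timestamp")
    url_idx = header.index("original")
    return [
        (row[ts_idx],
         "https://web.archive.org/web/{}/{}".format(row[ts_idx], row[url_idx]))
        for row in data
    ]


def fetch_captures(page_url):
    """Download the capture list for page_url (needs network access)."""
    with urlopen(build_cdx_query(page_url)) as response:
        body = response.read().decode("utf-8")
    rows = json.loads(body) if body.strip() else []
    return captures_from_rows(rows)
```

Calling fetch_captures with the iTunes page URL of an application should then give one (timestamp, snapshot URL) pair per archived copy, ready to be downloaded and parsed.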

Here is the source code for getting the history of the most popular iPhone applications. The idea is simple: iterate over a predefined list of application URLs; get the history of the URL for each application; parse the application size and release date; save the data in a form the plotter can read.
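The parsing and saving steps could look roughly like the sketch below. It makes several assumptions that are not taken from the original source code: that the size appears in the page text as a number followed by KB, MB or GB, that the release date looks like "Mar 25, 2017", and that a two-column CSV file is a convenient format for the plotter. All function names are hypothetical.

```python
import csv
import re
from datetime import datetime

# Assumed size format: a number followed by a unit, e.g. "123.4 MB".
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(KB|MB|GB)", re.IGNORECASE)
UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1.0, "GB": 1024.0}


def parse_size_mb(text):
    """Return the first size found in text, converted to megabytes, or None."""
    match = SIZE_RE.search(text)
    if match is None:
        return None
    value, unit = match.groups()
    return float(value) * UNIT_TO_MB[unit.upper()]


def parse_release_date(text, fmt="%b %d, %Y"):
    """Parse a release date string; the default format is an assumption."""
    return datetime.strptime(text.strip(), fmt).date()


def save_history(path, records):
    """Write (release_date, size_mb) pairs to a CSV file, oldest first."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["release_date", "size_mb"])
        for release_date, size_mb in sorted(records):
            writer.writerow([release_date.isoformat(), "{:.1f}".format(size_mb)])
```

Storing dates in ISO format keeps the file easy to sort and to read back with datetime.strptime(value, "%Y-%m-%d").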

Plotting with Matplotlib

Matplotlib is one of the most popular 2D plotting libraries for Python. If you are familiar with MATLAB, you will have no problem using this library. If you are not, there is a set of tutorials for everyone from beginners to advanced users.

The source code for plotting the relation between release date and size is pretty straightforward: for each application data file, a new subplot is created and used to plot size against release date. All subplots are placed in one figure, which can be displayed on the screen or saved in different formats. In our case, the subplots for all applications are saved to the resulting image file using savefig.
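The subplot-per-application approach described above can be sketched as follows. This is an illustration rather than the plotter from this post: plot_app_sizes is a hypothetical name, the data is passed in as a dictionary instead of being read from files, and the sample values are made-up placeholders, not measured application sizes.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
from datetime import date


def plot_app_sizes(apps, out_path):
    """Draw one subplot per application and save the figure to out_path.

    apps maps an application name to a list of (release_date, size_mb) pairs.
    """
    if not apps:
        raise ValueError("no applications to plot")
    fig, axes = plt.subplots(nrows=len(apps), ncols=1,
                             figsize=(8, 3 * len(apps)), squeeze=False)
    for ax, (name, history) in zip(axes[:, 0], sorted(apps.items())):
        history = sorted(history)  # oldest release first
        dates = [release_date for release_date, _ in history]
        sizes = [size_mb for _, size_mb in history]
        ax.plot(dates, sizes, marker="o")
        ax.set_title(name)
        ax.set_ylabel("Size (MB)")
    fig.autofmt_xdate()    # rotate the date labels so they do not overlap
    fig.tight_layout()
    fig.savefig(out_path)  # the file extension selects the output format
    plt.close(fig)


# Placeholder values for illustration only.
SAMPLE_APPS = {
    "App A": [(date(2017, 3, 25), 100.0), (date(2017, 6, 1), 120.0)],
    "App B": [(date(2017, 4, 10), 80.0), (date(2017, 7, 15), 95.0)],
}
```

Calling plot_app_sizes(SAMPLE_APPS, "apps_stats.png") writes a PNG with two stacked subplots; changing the extension to .pdf or .svg saves the same figure in another format.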

[Figure: apps_stats, application size over release date, one subplot per application]

Here is the source code for the Internet Archive parser and the Matplotlib plotter, along with the parsed data for the most popular iPhone applications.