Data Platform/Data Lake/Traffic/Pagecounts-ez

This dataset is described on its dumps download page.

This dataset stores, in a compressed format, the best pageview data that the Wikimedia Foundation had available at each point in its history:

  • From 2007 to December 2015, it compressed the now-deprecated pagecounts-raw dataset (pageviews per project from December 2007 on, and pageviews per article from late 2011 on).
  • From December 2015 to the present, it compresses the pageviews dataset.

More information about each of those datasets can be found on their pages.

One hour skewing issue

Compared with the canonical Pageviews API, the data in this dataset is skewed by one hour: the value that Pagecounts-EZ reports for a given hour actually belongs to the previous hour. For example, the count that Pagecounts-EZ reports for midnight corresponds in reality to 11pm of the previous day:

Dataset        12am  1am  2am  3am  4am  5am  6am  7am  8am  9am  10am  11am  12pm
Pageview API   2323443345641253465443645986575
Pagecounts EZ  8923234433456412534654436459865
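
One way to compensate for the skew when comparing against the Pageviews API is to move every Pagecounts-EZ hour label back by one hour. The snippet below is a minimal sketch of that idea, not an official tool: the function name `realign_pagecounts_ez`, the dictionary layout (hour label mapped to count), and the sample values are all assumptions made for illustration, and it does not parse the actual Pagecounts-EZ file format.

```python
from datetime import datetime, timedelta


def realign_pagecounts_ez(hourly_counts):
    """Return a copy of the counts keyed by the hour they actually describe.

    Assumption (from the skew described above): a count that Pagecounts-EZ
    labels as hour H really belongs to hour H - 1.
    """
    return {
        label - timedelta(hours=1): count
        for label, count in hourly_counts.items()
    }


# Hypothetical sample values keyed by the hour label as reported by Pagecounts-EZ.
ez_counts = {
    datetime(2016, 3, 2, 0): 89,  # labeled midnight
    datetime(2016, 3, 2, 1): 23,  # labeled 1am
}

corrected = realign_pagecounts_ez(ez_counts)
for hour, count in sorted(corrected.items()):
    print(hour.isoformat(), count)
```

After realignment, the count labeled midnight on 2 March is keyed to 11pm on 1 March, so it can be compared directly with the Pageviews API value for that hour.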
