We know that deploying PageOneX on your own server or on Heroku is not an easy task. Before we deploy it to its final destination, pageonex.com, we want to hear from you.
- The second time I run the app I get: “An error occurred while installing factory_girl (3.0.0), and Bundler cannot continue.”
- Listing of all the front page images in display view. Fixed. Just committed.
- Sometimes the scraper fails. Limitation of dates? For example: October 2011 fails… Fixed: in line 0 of lib/scraper.rb, remove the 0; it was causing problems for months 10, 11 and 12.
- Add the limit date for kiosko when creating a thread. Is it different for different newspapers? I think so. It would be great to have a message like: “x images from x newspapers have not been found.” (We should build a scraper, for future updates, to detect when a newspaper was added to kiosko.net.)
- Edit: thread features (see highlighted areas in the coding view once you come back from display). Which things will not be editable? Dates and media cannot be changed, so the scraper doesn’t run again.
- Adding the “nothing to be coded” button + displaying coded images differently from the ones that are not coded. https://github.com/numeroteca/pageonex/issues/12 In the coding view, once you press the button a div will appear with “this image has been coded” and the image will be shown with opacity 0.6.
- Edit: be able to view and edit areas. https://github.com/numeroteca/pageonex/issues/7
- Data associated with image: show the real name of the newspaper, not the url name. Also real links to the source of the image, the newspaper website (we might need to re-run the script at https://gist.github.com/2970558 and add the url of the newspaper) and the date (hmm, where would this link to?). https://github.com/numeroteca/pageonex/issues/33
- Question: when coding a large amount of images, it is difficult to know where we are. In which order should the newspapers appear? Order by date and not by newspaper? Let’s try to order by date.
- Newspaper by row https://github.com/numeroteca/pageonex/issues/31 Newspaper name in the first column.
- Creating thumbnails for front pages https://github.com/numeroteca/pageonex/issues/22 and resize those thumbnails, not the full size pages.
- Add a link when you click on an image, so you can re-edit it (recode images).
- Add dates to have a reference (each column of images shows its date).
- Quantification of highlighted areas: bar graph.
- Colors of codes in display view.
- Online test? We need an online version for beta testers. What are our needs in terms of server, domain… so I can prepare? We can use http://www.heroku.com/ for the first test, and then start building it on our own server.
- Compatibility with other browsers (Bootstrap itself provides this feature): http://twitter.github.com/bootstrap/
- Open/close feature
- Select media sources (front pages) from different months. Now we can only select days within a month.
- Be able to select scraper source:
- the other one built into our scrape.rb: El País, NYT
- While scraping: show files that are being downloaded/failing
- Show which threads are open (all threads) and be able to search.
- Export graph and data
- Select / unselect newspapers
- Select the order in which newspapers appear.
- Question: how will non-coders view the display? Any difference in the links to coded images?
- Why jQuery carousel vs. single view.
- How highlighted areas are handled: gem used? Storing coordinates? Storing width-height?
And keep going with other issues to explain different decisions in the development process.
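The month bug from the list above (months 10, 11 and 12 failing) and the wish to select days across different months can both be illustrated with a small Ruby sketch. This is only one plausible reading of the bug, not the actual code in lib/scraper.rb: it assumes a hard-coded "0" was being prepended to the month. The helper names `date_path`, `date_paths` and `naive_month` are hypothetical.

```ruby
require "date"

# Hypothetical helper: builds the date part of an image path, e.g. "2011/10/05".
# format("%02d", n) pads only single-digit values, so 10, 11 and 12 stay as they are.
def date_path(date)
  format("%04d/%02d/%02d", date.year, date.month, date.day)
end

# A Date range crosses month boundaries without any special handling,
# so a thread could span the end of one month and the start of the next.
def date_paths(first_day, last_day)
  (first_day..last_day).map { |day| date_path(day) }
end

# Assumed shape of the bug: a hard-coded "0" prefix breaks two-digit months.
def naive_month(month)
  "0#{month}"
end

puts naive_month(10)                   # prints 010
puts date_path(Date.new(2011, 10, 5))  # prints 2011/10/05
puts date_paths(Date.new(2011, 9, 29), Date.new(2011, 10, 2))
```

Letting `format` do the padding keeps one code path for all twelve months, and iterating over a `Date` range avoids counting the days of each month by hand.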
User interface drafts from some weeks ago
First previews from the Ruby on Rails application:
Rporres was visiting Cambridge last week, and after listening to one of the online meetings, he decided to jump into the project. By night he had finished the script for grabbing all the newspapers that are available in Kiosko.net. We will need this list soon.
You can check the code at https://gist.github.com/2970558 or the output (a csv file with the name, friendly url name, country and country code of all the 377 newspapers) at http://brownbag.me:9001/p/pageonex-kiosko-newspaper-names. The script is written in Perl.
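As a rough idea of how the Rails application could consume that list, here is a minimal Ruby sketch that reads a csv with those four fields and groups the newspapers by country code. The header names, the column order and the sample rows are assumptions made for illustration; the real file may be laid out differently. `Newspaper`, `parse_newspapers` and `by_country_code` are hypothetical names.

```ruby
require "csv"

# Made-up sample rows; the real output lists 377 newspapers and its header
# names and column order may differ from what is assumed here.
SAMPLE_CSV = <<~CSV
  name,url_name,country,country_code
  El País,elpais,Spain,es
  The New York Times,newyork_times,United States,us
CSV

Newspaper = Struct.new(:name, :url_name, :country, :country_code)

# Turns each csv row into a Newspaper struct.
def parse_newspapers(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    Newspaper.new(row["name"], row["url_name"], row["country"], row["country_code"])
  end
end

# Groups newspapers by country code, e.g. to build a country selector.
def by_country_code(newspapers)
  newspapers.group_by(&:country_code)
end

papers = parse_newspapers(SAMPLE_CSV)
by_country_code(papers).each do |code, list|
  puts "#{code}: #{list.map(&:name).join(', ')}"
end
```

Keeping both the real name and the friendly url name in one record would also cover the request to show the real name of a newspaper instead of its url name.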
Ahmd had started another similar one in Ruby, but we put it on hold for the short term.
Thanks Rafa for your help!
Ahmd has been working on a scraper in Ruby for the front pages at Kiosko.net.
I’ve finished the scraping script, and it’s public at https://gist.github.com/2925910. To run the script, just pass the file to ruby [ruby scraper.rb]. It will generate the folders (the directories are set for Linux; if you are on Windows you should modify them first), download the images (you can change the variable values in the get_issues method to get different newspapers) and write the log to stdout.
Check the script below.
I’ve also contacted Newseum to see if their “only today” front page database is available for PageOneX.
Script to scrape front page images of newspapers from kiosko.net
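The gist is not reproduced here. The steps described above (generate the folders, download the images, write a log to stdout) can be sketched as follows. This is a minimal illustration and not Ahmd's script: the URL pattern in `image_url` is an assumption, and `image_url`, `local_path` and `download` are hypothetical names. Building paths with `File.join` avoids hard-coding Linux-style directories.

```ruby
require "fileutils"
require "open-uri"
require "tmpdir"

# Assumed URL pattern, for illustration only; check the gist for the real one.
def image_url(date_path, country_code, url_name)
  "http://img.kiosko.net/#{date_path}/#{country_code}/#{url_name}.750.jpg"
end

# File.join builds a path that Ruby accepts on Linux and on Windows.
def local_path(root, url_name, date_path)
  File.join(root, url_name, "#{date_path.tr('/', '-')}.jpg")
end

# Downloads one image and logs the result. The fetcher is injectable so the
# flow can be tried without network access; the default uses open-uri.
def download(url, path, fetcher: ->(u) { URI.open(u, "rb", &:read) }, log: $stdout)
  FileUtils.mkdir_p(File.dirname(path))
  File.binwrite(path, fetcher.call(url))
  log.puts "ok      #{url} -> #{path}"
  true
rescue StandardError => e
  log.puts "failed  #{url} (#{e.class}: #{e.message})"
  false
end

# Dry run with a fake fetcher, so nothing is requested from kiosko.net.
fake_fetcher = ->(_url) { "not a real image" }
Dir.mktmpdir do |root|
  url = image_url("2012/06/15", "es", "elpais")
  download(url, local_path(root, "elpais", "2012/06/15"), fetcher: fake_fetcher)
end
```

Returning true or false per image makes it easy to count failures afterwards, which is what a message like “x images from x newspapers have not been found” would need.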
Crossposting from numeroteca.org.
View this datavis full size at gigapan.
Today’s post presents the tool we are building this summer: PageOneX. The idea behind it is to move the coding process of newspaper front pages online and make it easier, so that this visualization process is available to researchers, advocacy groups and anyone interested. I’ll give some background about this process.
How things started
Approximately one year ago I started diving into the front page world. It was days after the occupations of squares in many cities in Spain, and I was living in Boston. I made a front page visualization to show what people were talking about: the blackout in the media about the indignados #15M movement. You can read more about the story in the Civic Media blog. Since then I’ve been making more visualizations around front pages of paper newspapers, testing different methods and possible ways to use them. I’ve also made a tool, built in Processing, to scrape front pages from kiosko.net and build a .svg matrix.
- Gallery of different twitter-newspaper visualizations. http://numeroteca.org/cat/frontpage-newspaper/
- Post: Analyzing newspapers’ front pages to interpret the Mainstream Media ecology.
- Presentation: Approach to a User interface. http://www.slideshare.net/numeroteca/arab-spring-spanish-recolution-and-occupy-movement-mainstream-media-vs-social-media-coverage and more presentations at http://www.slideshare.net/numeroteca
- Code for the semi-automated process built in Processing: https://github.com/numeroteca/pageonex-processing
I’ve met with Nathan Matias, who is developing Media Meter http://mediameter.org/ (development version http://mmdev.media.mit.edu/, please do not share the link yet), a tool built on Ruby on Rails to crowdsource the analysis of news and test intercoder reliability. They are offering their code (the github link will come soon). We still have to figure out if we go for it and, if we do, how the collaboration will work: fork it or stay on the same platform. Nathan, the main developer, is offering online support and to be on the conference calls once a week. He is willing to expand the tool to more uses.
I think it is a great idea to not start from scratch and to have Nathan’s help. At the same time I am worried about starting our project with many things that we do not need. Any thoughts on this?