Presentations and hands-on, Monday meeting

June 11th 2012. 10am-12.45pm EST
People: Ahmd + Pablo

Check the live notes of the meeting at http://brownbag.me:9001/p/120611pageonex

Presentations

Ahmd @AhmdRefat
Studies, Cairo, elections
Work at Pageonex and studies

Pablo @numeroteca
http://civic.mit.edu
Paper with Sasha: montera34.com/personal/pablo/120313_FrontpagevsTwitter.pdf

Work flow:

Technical

Domain.
pageonex.com (redirects to numeroteca.org) and pageonex.org. I bought them at http://gandi.net

Server space
Hard disk space: calculate what we need; estimate the approximate size.
Gandi.net 1 share provides 3 GB.

1 standard visualization: normally 30 days * 6 newspapers = 180 images. 180 * 500 KB = 90,000 KB ≈ 88 MB

  • Images: png, jpg (each is about 500 KB)
  • 3 sizes: full size, medium format (screen) and thumbnail
  • Different formats of front pages, different sizes on screen
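The estimate above can be sketched as a small Ruby calculation. The 30 days, 6 newspapers and 500 KB figures come from these notes; the relative sizes of the medium and thumbnail versions are hypothetical values, only there to show how storing three sizes changes the total.

```ruby
# Back-of-the-envelope storage estimate for one visualization.
DAYS = 30            # from the notes
NEWSPAPERS = 6       # from the notes
FULL_SIZE_KB = 500   # from the notes: each image is about 500 KB

# Hypothetical relative sizes of the three stored versions (assumptions).
SIZE_FACTORS = { full: 1.0, medium: 0.25, thumbnail: 0.02 }

# Number of front pages in one visualization.
def images_per_visualization(days = DAYS, newspapers = NEWSPAPERS)
  days * newspapers
end

# Total KB needed, given the relative size of each stored version.
def storage_kb(factors = { full: 1.0 })
  kb_per_front_page = factors.values.sum * FULL_SIZE_KB
  images_per_visualization * kb_per_front_page
end

def kb_to_mb(kb)
  (kb / 1024.0).round(1)
end

full_only = storage_kb
all_sizes = storage_kb(SIZE_FACTORS)
puts "Full size only:  #{full_only} KB = #{kb_to_mb(full_only)} MB"
puts "All three sizes: #{all_sizes} KB = #{kb_to_mb(all_sizes)} MB"
```

Dividing the 3 GB of one Gandi share by the second figure gives a rough number of visualizations that would fit.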

Traffic 
Not yet.

Libraries for Ruby

Images handling

Visualization

UI Design
[UI draft https://docs.google.com/presentation/d/1C0XMk14KMNINQFrAnkr6eGFVVSD-2xXuy7lwKRs7jL8/edit]
Bootstrap – http://twitter.github.com/bootstrap/
Mediameter – https://github.com/c4fcm/MediaMeter-Coder

Mediameter
Ahmd: It’s not similar; we are not going to use most of it. The models are created, but not all of them are implemented, or we are not going to use them (see https://github.com/c4fcm/MediaMeter-Coder/tree/master/app/models ). It’ll take more time to start from it. Nathan could build an API for us to use it. There are concerns about delaying the project in order to use it. Looking toward the midterm. Prefer to start from scratch: it’d be faster.
The web app is the architecture. Ruby: DB, controllers (one for every action: creating users, creating threads, creating front pages, scraping from any media, UI, handling images, displaying).

Pablo: Both projects share the same idea: coding articles in the news. We should try to work together. I’ll contact Sasha and Nathan to get feedback on this. My feeling is that it is always better to build on existing tools, but I also understand Ahmd’s wish to “start from clean”. We should take a decision soon. About Mediameter: is the latest version of the code on GitHub? It seems that it hasn’t been updated recently.

Tasks, step by step

  1. Scraping
  2. Storing
  3. Analyze / coding
  4. Display / data vis
  5. UI Design
  6. Legal issues

1. Scraping
As wide approach as possible.

Ruby scrapers:
-Start with http://en.kiosko.net/
-Newseum http://www.newseum.org/todaysfrontpages/ The code is difficult to scrape. Example, in Egypt:
http://www.newseum.org/todaysfrontpages/hr.asp?fpVname=EGY_ALT&ref_pge=gal&b_pge=1

Consult with them by email.

-Other newspapers: check for them! Research at

and other major newspapers
Spain
Egypt
Mexico
US
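As a starting point for the kiosko.net scraper, here is a minimal Ruby sketch that only builds image URLs; it does not download anything. The URL pattern is inferred from the single New York Times example in section 3 of these notes and is an assumption, not a documented kiosko.net API. The method names are hypothetical.

```ruby
require "date"

# Image host taken from the one example URL in these notes.
KIOSKO_IMAGE_HOST = "http://img.kiosko.net"

# Builds a front page image URL. The pattern
#   host/YYYY/MM/DD/country/newspaper.size.jpg
# is inferred from a single example and may not hold for every newspaper.
def kiosko_image_url(date, country, newspaper, size = 750)
  format("%s/%s/%s/%s.%d.jpg",
         KIOSKO_IMAGE_HOST, date.strftime("%Y/%m/%d"), country, newspaper, size)
end

# One URL per day for a date range, e.g. the 30 days of a visualization.
def kiosko_urls_for_range(first_day, last_day, country, newspaper)
  (first_day..last_day).map { |day| kiosko_image_url(day, country, newspaper) }
end

puts kiosko_urls_for_range(Date.new(2012, 6, 11), Date.new(2012, 6, 13),
                           "us", "newyork_times")
```

Each source (Kiosko, Newseum, individual newspapers) would need its own URL builder, so keeping this step separate from downloading and storing should make new sources easier to add.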

2. Storing / Data base

DB tables:
User: Id, user-name, email, password, thread
Thread: Id, Thread-name, start-date, end-date, newspaper(s)
Image: Id, type, newspaper_id, date, size
Newspaper: Id, newspaper name, country, city
Highlighted areas: image-id, tag(s), user
Area: area-id, X1, Y1, X2, Y2, highlighted_area_id
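The tables above can be sketched as plain Ruby structs to check how they relate before writing any real models. This is only an illustration: the field names follow the list above, the `id` on highlighted areas is assumed because Area refers to it, `size_kind` stands in for the image size, and the thread is called `CodingThread` because `Thread` is already a built-in Ruby class.

```ruby
require "date"

User            = Struct.new(:id, :user_name, :email, :password, :thread_ids, keyword_init: true)
CodingThread    = Struct.new(:id, :name, :start_date, :end_date, :newspaper_ids, keyword_init: true)
Newspaper       = Struct.new(:id, :name, :country, :city, keyword_init: true)
# size_kind: :full, :medium or :thumbnail (hypothetical labels)
Image           = Struct.new(:id, :type, :newspaper_id, :date, :size_kind, keyword_init: true)
HighlightedArea = Struct.new(:id, :image_id, :tags, :user_id, keyword_init: true)
Area            = Struct.new(:id, :x1, :y1, :x2, :y2, :highlighted_area_id, keyword_init: true)

# All rectangles that belong to one highlighted area.
def areas_for(highlight, areas)
  areas.select { |area| area.highlighted_area_id == highlight.id }
end

nyt = Newspaper.new(id: 1, name: "The New York Times", country: "US", city: "New York")
image = Image.new(id: 1, type: "jpg", newspaper_id: nyt.id,
                  date: Date.new(2012, 6, 11), size_kind: :full)
highlight = HighlightedArea.new(id: 1, image_id: image.id, tags: ["elections"], user_id: 1)
areas = [
  Area.new(id: 1, x1: 0, y1: 0,   x2: 300, y2: 200, highlighted_area_id: highlight.id),
  Area.new(id: 2, x1: 0, y1: 200, x2: 150, y2: 400, highlighted_area_id: highlight.id)
]

puts areas_for(highlight, areas).length
```

Keeping Area separate from the highlighted area lets one coded article be made of several rectangles.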

Twitter: for later. ToDo. Not yet.

Newspaper: source? Wondering about the scalability of the system. Thinking of other sources: magazines, blogs…

3. Analyze / code
Questions: how to identify non-rectangular news items? Example: http://img.kiosko.net/2012/06/11/us/newyork_times.750.jpg
Possible solution:
-Multiple area selection
-Select and remove a part of the selection after selecting the main article
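Both options can be modelled with the same structure: a list of rectangles that are included and a list that are removed. The Ruby sketch below is a hypothetical illustration of that idea, not tied to Jcrop or imgAreaSelect; the class and method names are made up.

```ruby
# A rectangle given by two opposite corners.
Rect = Struct.new(:x1, :y1, :x2, :y2) do
  # True when the point (x, y) lies inside the rectangle, borders included.
  def cover?(x, y)
    x.between?(x1, x2) && y.between?(y1, y2)
  end
end

# A selection made of included rectangles minus removed rectangles.
class Selection
  def initialize
    @included = []
    @removed = []
  end

  def add(rect)
    @included << rect
    self
  end

  def remove(rect)
    @removed << rect
    self
  end

  # A point belongs to the article if some included rectangle covers it
  # and no removed rectangle does.
  def cover?(x, y)
    @included.any? { |r| r.cover?(x, y) } && @removed.none? { |r| r.cover?(x, y) }
  end
end

# An L-shaped article: select the main block, then remove the lower right part.
article = Selection.new
article.add(Rect.new(0, 0, 300, 400))
article.remove(Rect.new(150, 200, 300, 400))

puts article.cover?(50, 300)
puts article.cover?(200, 300)
```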

For area selection: http://deepliquid.com/content/Jcrop.html or http://odyniec.net/projects/imgareaselect/ Not using pixels, but coordinates.
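One possible reading of "not pixels, but coordinates" is to store each selection relative to the image size, so the same area can be drawn on the full size, medium and thumbnail versions. The sketch below assumes that reading; the image dimensions are hypothetical.

```ruby
# Converts a pixel rectangle [x1, y1, x2, y2] to relative coordinates (0.0 to 1.0).
def to_relative(px_rect, width, height)
  x1, y1, x2, y2 = px_rect
  [x1.fdiv(width), y1.fdiv(height), x2.fdiv(width), y2.fdiv(height)]
end

# Converts a relative rectangle back to pixels for an image of the given size.
def to_pixels(rel_rect, width, height)
  x1, y1, x2, y2 = rel_rect
  [(x1 * width).round, (y1 * height).round, (x2 * width).round, (y2 * height).round]
end

# Selection made on a hypothetical 750 x 1000 px image...
relative = to_relative([75, 100, 375, 500], 750, 1000)
# ...redrawn on a hypothetical 150 x 200 px thumbnail.
thumbnail_rect = to_pixels(relative, 150, 200)

puts relative.inspect
puts thumbnail_rect.inspect
```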

Low resolution grid, for later. To facilitate intercoder reliability.
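A rough Ruby sketch of the grid idea: each selection is reduced to the set of grid cells it touches, so two coders who draw slightly different rectangles can still end up with the same cells. The grid size, the use of relative coordinates and the overlap measure (shared cells divided by all cells touched, the Jaccard index) are assumptions for illustration; it is not a formal intercoder reliability statistic.

```ruby
require "set"

GRID_COLUMNS = 4   # hypothetical grid size
GRID_ROWS = 4

# rect is [x1, y1, x2, y2] in relative coordinates (0.0 to 1.0).
# Returns the set of [row, column] cells the rectangle touches.
def grid_cells(rect, columns = GRID_COLUMNS, rows = GRID_ROWS)
  x1, y1, x2, y2 = rect
  first_col = (x1 * columns).floor
  last_col  = [(x2 * columns).ceil - 1, columns - 1].min
  first_row = (y1 * rows).floor
  last_row  = [(y2 * rows).ceil - 1, rows - 1].min

  cells = Set.new
  (first_row..last_row).each do |row|
    (first_col..last_col).each { |col| cells << [row, col] }
  end
  cells
end

# Shared cells divided by all cells touched by either coder (Jaccard index).
def cell_agreement(cells_a, cells_b)
  union = cells_a | cells_b
  return 1.0 if union.empty?
  (cells_a & cells_b).size.fdiv(union.size)
end

coder_a = grid_cells([0.0, 0.0, 0.5, 0.5])
coder_b = grid_cells([0.02, 0.01, 0.48, 0.49])  # slightly different rectangle
coder_c = grid_cells([0.0, 0.0, 0.45, 0.55])    # spills into the next row

puts cell_agreement(coder_a, coder_b)
puts cell_agreement(coder_a, coder_c)
```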

4. Display / data vis
Check GigaPan: good for navigating.

HTML,
interactive SVG,…
How to handle a huge number of thumbnails: one matrix picture…
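If the thumbnails are combined into one matrix picture, the position of each thumbnail can be computed from its newspaper and day. A minimal Ruby sketch, assuming one row per newspaper, one column per day and a hypothetical fixed thumbnail size:

```ruby
THUMB_WIDTH = 75    # hypothetical thumbnail size in pixels
THUMB_HEIGHT = 100

# Top left corner of one thumbnail inside the matrix picture.
# Rows are newspapers, columns are days (both counted from 0).
def thumbnail_offset(newspaper_index, day_index, width = THUMB_WIDTH, height = THUMB_HEIGHT)
  { x: day_index * width, y: newspaper_index * height }
end

# Size of the whole matrix picture.
def matrix_dimensions(newspapers, days, width = THUMB_WIDTH, height = THUMB_HEIGHT)
  { width: days * width, height: newspapers * height }
end

# The standard visualization from these notes: 6 newspapers, 30 days.
puts matrix_dimensions(6, 30).inspect
puts thumbnail_offset(2, 10).inspect
```

The same offsets could be used to position one background image in HTML or to place elements in an SVG, so the browser loads one picture instead of 180.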

Libraries…

5. UI
–discussed before–

6. Legal issues
Ask Berkman Center
Center for Civic Media

Newseum
Kiosko

———
Ahmd’s main work: scrapers and model building.
