
Need HTML spider to catalog and convert video gallery pages into CSV/ZIP
Project Id: 1461244

Posted by: souther (13 ratings)
(Employer rating 10)
Non-action Ratio: Very Good - 0.00%
Employer Security Verifications: Excellent
Approved on: Jul 24, 2010 3:32:13 AM EDT
Bidding Closes: Aug 7, 2010 3:30:49 AM EDT
Viewed (by workers): 783 times
Deadline: Please estimate in your bid
Phase: Cancelled
Employer cancelled on 8/6/2010 3:37:13 PM because: "".
Reposted as: 1469494
Payment Model: Pay-for-Deliverables
Max Accepted Bid: Bidding is closed
Project Type: Small Business Project: $100(USD) and above
Bidding Type: Open Auction
Accepted Bidder Economy Type(s): All
Accepted English fluency(ies): All
ExpertRating Requirement: None



Brief Summary:
Web application with simple one-page interface.

Input: A list of URLs of thumbnail/movie gallery HTML pages which have textual, graphical, image, and video content in various formats and layouts.
Output: A CSV dump file which includes the Title, Description, Video thumbnail filename, Video filename (FLV format), Video duration (minutes:seconds), Video height (pixels), Video width (pixels) AND a ZIP file including the video and thumbnail files.
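A minimal sketch of the CSV side of the Output described above. It is written in Python purely for illustration (the posting itself asks for PHP). The column header strings and the helper names format_duration and write_catalog are assumptions, not part of the original requirements.

```python
import csv
import io

# Column names are an assumption derived from the Output description.
CSV_COLUMNS = [
    "Title",
    "Description",
    "Video thumbnail filename",
    "Video filename",
    "Video duration",
    "Video height",
    "Video width",
]


def format_duration(total_seconds):
    """Format a duration in seconds as minutes:seconds, e.g. 125 -> '2:05'."""
    minutes, seconds = divmod(int(total_seconds), 60)
    return f"{minutes}:{seconds:02d}"


def write_catalog(rows, out):
    """Write one CSV line per video clip; each row is a dict keyed by CSV_COLUMNS."""
    writer = csv.DictWriter(out, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_catalog(
        [
            {
                "Title": "Example Gallery #1",
                "Description": "A short clip.",
                "Video thumbnail filename": "Example_Gallery_1.jpg",
                "Video filename": "Example_Gallery_1.flv",
                "Video duration": format_duration(125),
                "Video height": 240,
                "Video width": 320,
            }
        ],
        buffer,
    )
    print(buffer.getvalue())
```

The standard csv module handles quoting, so titles or descriptions containing commas would still yield one line per clip.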

Magic:
1) Spider will need to navigate each page and catalog all direct links to video files, as well as video files hosted on the page itself.
1a) Spider will need to send the gallery page as the HTTP Referer when requesting images and videos, to get around web server anti-leeching (hotlink) restrictions.
2) If the video file is in WMV format, convert it to FLV and save a local copy for inclusion in ZIP file. If the video file is already in FLV format, just save a local copy.
3) If the video file is direct linked from the gallery page, and it is linked from this page via a linked image, save that image as the thumbnail for that video.
4) Determine the height and width of the video file and resize the image thumbnail to the same dimensions.
5) If there is no image thumbnail (see #3), create a thumbnail (with the same height/width as the video; jpeg format) from a random frame of the video file.
6) Determine the duration of the video clip and save this for export in the resulting CSV file.
7) Save the title of the gallery page as the title of the video clip.  If there are multiple videos on a single gallery, append "#1", "#2", etc. to the end of the title.
7a) Rename video file and video thumbnail file to correspond to this same title format (to prevent filename collisions with past or future videos).
8) Catalog all text blocks at least 2 sentences long for export as the description of the video clips on the page.
9) Use a predefined list of 'stop' words to disqualify certain sentences from being included in Description collation.
10) Export a CSV file with one line per video clip and all other information described in Output section above. There may be multiple entries if there are multiple videos hosted or linked from the gallery page.
11) Export a ZIP file which includes all of the thumbnails and FLV video files.
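A rough sketch of the text-handling steps above (Referer header, title numbering, file renaming, and the description filter with stop words). It is in Python for illustration only, since the posting asks for PHP. All function names are hypothetical. Putting a space before "#1", replacing non-alphanumeric characters with underscores in filenames, the naive sentence splitter, and matching stop words as whole words are assumptions the posting does not specify.

```python
import re
import urllib.request


def request_with_referer(url, gallery_url):
    """Build a request that names the gallery page as the HTTP Referer."""
    return urllib.request.Request(url, headers={"Referer": gallery_url})


def number_titles(page_title, video_count):
    """Return one title per video; append #1, #2, ... only when there are several."""
    if video_count <= 1:
        return [page_title]
    return [f"{page_title} #{i}" for i in range(1, video_count + 1)]


def safe_filename(title, extension):
    """Derive a filename from a title (assumed rule: non-alphanumerics become '_')."""
    stem = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return f"{stem}.{extension}"


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_description(text_blocks, stop_words):
    """Keep blocks of 2+ sentences, then drop sentences containing a stop word."""
    stop = {word.lower() for word in stop_words}
    kept = []
    for block in text_blocks:
        sentences = split_sentences(block)
        if len(sentences) < 2:
            continue  # too short to count as descriptive text
        for sentence in sentences:
            words = set(re.findall(r"[a-z0-9']+", sentence.lower()))
            if words.isdisjoint(stop):
                kept.append(sentence)
    return " ".join(kept)


if __name__ == "__main__":
    for title in number_titles("Example Gallery", 2):
        print(title, "->", safe_filename(title, "flv"))
    blocks = [
        "Click here.",
        "A short clip from the park. Join now for more! It was filmed in June.",
    ]
    print(build_description(blocks, ["join"]))
```

A real implementation would likely need a more careful sentence splitter and an agreed rule for stop-word matching (whole word versus substring), which are worth confirming with the employer.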


Requirements Interview Answers:
To help you bid more accurately, the employer was interviewed about the requirements for this project. Below are their answers.
Project Type: What kind of work do you need done?
Software related (Includes desktop applications and internet websites)
Project Parts: What do you want the worker to do on this project?
Requirements: The worker will analyze the problem and propose a software-based solution to the problem.
Programming: The worker will take the requirements and translate them into the language of the computer (and test it).
Req. Doc. Type: What kind of documentation do you want for this project?
Informal documentation - As the employer talks back and forth about the project with the worker, those conversations become the requirements.
Remember to communicate ALL of the details of your project on the vWorker.com site. If you don't, and there is a dispute, then important details of the contract will not be documented and cannot be taken into account in arbitration. If you feel you MUST go offsite (for example, using the phone or IM) then afterwards post everything onsite and get the other party to post that they agree to those contractual terms.
Program Type: What kind of software should the worker create (and/or install)?
  • An internet web-site: This software runs on a web server and users will access it using their internet browser.
Internet web-site info
Design and functionality: What does the programming of this project involve?
  • Program Functionality: Making the website "work".
Modeling another site: Do you wish to model another site? No
Size of website: How many pages need to be created/edited in this website?
Approximately 1.
Programming Language: What programming language(s) do you want your website written in?
I do know the language(s).
Language(s):
  • Flash
  • PHP
  • XML / XHTML
Database: Will this project include a database?
No, it does not include a database.
Browser Type(s)/Version(s): Which browser/version combinations must this website support?
  • IE 8.0
  • Firefox 3.6
  • Safari 41
Server Hosting Environment: What is your server hosting environment?
I have my own in-house server.
Server Hosting Environment: Will the worker develop "live" on your server?
No. The worker is responsible for creating their own development and/or qc environment.
Legal: 1) I require complete and fully-functional working program(s) in executable form as well as complete source code of all work done (so that I may modify it in the future).
2) Deliverables must be in ready-to-run condition as follows (depending on the nature of the deliverables):
2a) All other software (including but not limited to any desktop software or software the employer intends to distribute) must include a software installation package that will install the software in ready-to-run condition on the platform(s) specified in this project (unless specified elsewhere by the Employer).
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Employer will receive exclusive and complete copyrights to all work purchased.
3b) No part of the deliverable may contain any copyright restricted 3rd party components (including GPL, GNU, Copyleft, etc.) unless all copyright ramifications are explained AND AGREED TO by the employer on the site per the worker's Worker Legal Agreement.
Other Requirements:
You will be required to develop and test this software with pages which contain adult content. You must accept this provision to be selected for this project.

Example gallery pages will be provided to use during development and test phases.
  • All deliverables must be uploaded to vWorker.com before the deadline(s) for this project...with no exceptions. If this contract makes it impossible for a competent person to do this, then do not start this project...but instead alert vWorker.com of an un-arbitratable, illegal project.
  • Remember that contacting the other party outside of the site (by email, phone, etc.) on all business projects < $500 (before the employer's money is escrowed) is a violation of both the employer and worker agreements. vWorker.com monitors all site activity for such violations and can instantly expel transgressors on the spot, so we thank you in advance for your cooperation. If you notice a violation please help out the site and report it. Thanks for your help.
Categories:
(Note: Like everything else on this page, these categories are part of the original contract for this project.)
Web development, Requirements, UNIX, Other (Technology), Web services, Linux, FreeBSD, XML / XHTML, Technology, Web programming, Tech details



 
Messages summary
All monetary amounts on the site are in United States dollars.
vWorker.com is a closed auction, so workers can only see their own bids and comments. Employers can view every posting made on their projects.

Bidding Closes At: Aug 7, 2010 3:30:49 AM EDT
  Max accepted bid: Open to fair suggestions
5 bids have been posted.

No bidding allowed, because this project was cancelled.

Cancelled Date: 8/6/2010 3:37:13 PM
Cancelled Reason:
Cancelled By: Person id: 6893374
 

 
Ratings

Rating | Rated | Rated For | Rated By | Rated On
3 (Poor) | mz412 | Need HTML spider to catalog and convert video gallery pages into CSV/ZIP | Michele Nisi (vWorker) | August 6, 2010 3:37:06 PM EDT
  (Arbitration/mediation result): Mz412 (the worker) chose to pull out of the project and break their contract. All funds in escrow were returned to souther (the employer).
None Given | souther | Need HTML spider to catalog and convert video gallery pages into CSV/ZIP | Michele Nisi (vWorker) | August 6, 2010 3:37:06 PM EDT
  (Arbitration/mediation result): Mz412 (the worker) chose to pull out of the project and break their contract. All funds in escrow were returned to souther (the employer).
None Given | mz412 | Need HTML spider to catalog and convert video gallery pages into CSV/ZIP | souther (who themselves is rated 10 - Excellent) | August 6, 2010 3:37:06 PM EDT
None Given | souther | Need HTML spider to catalog and convert video gallery pages into CSV/ZIP | mz412 (who themselves is rated 8.43 - Very Good) | August 6, 2010 3:37:06 PM EDT