steptail.com



Contributing Content to RetroWeb

Join our Discord channel or the Facebook group RetroWeb User Group, and ask one of the admins to help you get started.

You Will Need

  • Knowledge of HTML
  • The ability to run more than one web browser on your main computer, specifically for testing
  • Various computers and web browsers to test on
  • PuTTY

Getting started

  1. You need two logins: one for the Contributor Panel and one for SSH. Contact the Admins to get them, and tell them your preferred username.
  2. Install and run the latest version of PuTTY.
  3. In PuTTY, under Host Name, type sirius.steptail.com, and under Port, type 2269.
  4. On the navigation bar to the left, change to SSH and select the subcategory Tunnels.
  5. Under Source Port, type 8080. Under Destination, type 10.0.16.62:8080. Make sure the option Local is selected, then click Add.
  6. On the navigation bar to the left, go back to Session at the very top.
  7. You may save your settings now by typing in a name under Saved Sessions and clicking Save.
  8. Click Open to log in. Type in the SSH username and password you received from the Admins in step 1.
  9. Once logged in, you will need to set your web browser proxy parameters as follows:
    1. HTTP: localhost, Port 8080
    2. FTP: localhost, Port 8080
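
For contributors who are not on Windows, the PuTTY settings above correspond to a standard OpenSSH local port forward. The sketch below is not part of the official setup; it only builds the equivalent `ssh` command from the host, port, and tunnel values given in the steps, and offers a small check that something is listening on the local tunnel port. The function names are hypothetical and chosen for illustration.

```python
import socket


def openssh_tunnel_command(ssh_user: str) -> list[str]:
    """Build an OpenSSH command equivalent to the PuTTY settings above."""
    return [
        "ssh",
        "-p", "2269",                       # Port from step 3
        "-L", "8080:10.0.16.62:8080",       # Local tunnel from step 5
        f"{ssh_user}@sirius.steptail.com",  # Host Name from step 3
    ]


def tunnel_is_listening(host: str = "localhost", port: int = 8080,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the local tunnel port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: the tunnel is probably not up.
        return False
```

Printing `" ".join(openssh_tunnel_command("your-ssh-username"))` gives a command you can paste into a terminal. Once the SSH session is open, `tunnel_is_listening()` should return `True` before you point the browser proxy at localhost, port 8080.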

Contribution Workflow

The standard workflow is as follows:

  1. Decide which web site you'd like to archive. It should be a site that first appeared in the 1990s, or in 2001 at the latest.
  2. Open archive.org and track down a good copy of the website, noting the date on which it was archived. Browse the available dates until you find the one with the fewest broken links and images. This date will be your target date for the archiver.
  3. Open the archiver, and start an archiving job for the domain and the target date. Select a link depth that is realistic for the website. If it's a large website, it could take days to complete with depth=5 or depth=6.
  4. Once the job is complete, the site will be editable and can be previewed with the development proxy server. The development proxy server is access-restricted, and you will need to give me your IP address to allow access to it. Otherwise it works just like the production server.
  5. You will want to fix any broken links and images. Once you're done, you can hit publish. The website will be marked published, and will be viewable on the production server within 24 hours.
  6. If the archive job fails for whatever reason, feel free to delete it and start over.
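
Step 2 can be partly assisted with the public Wayback Machine availability endpoint, which returns the snapshot closest to a requested date. The following is a minimal sketch under that assumption; it is separate from the RetroWeb archiver, and the helper names are hypothetical. It builds the query URL for a site and a `YYYYMMDD` date and extracts the closest snapshot from the JSON response text. You still need to browse the snapshot yourself to judge how many links and images are broken.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine availability endpoint (not part of RetroWeb).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(site: str, target_date: str) -> str:
    """Build the query URL for a site and a target date in YYYYMMDD form."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode(
        {"url": site, "timestamp": target_date}
    )


def closest_snapshot(response_text: str):
    """Return (timestamp, snapshot_url) from a response, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]
```

Fetch the URL from `availability_query(...)` in a browser or with any HTTP client, then pass the response body to `closest_snapshot`. The first eight digits of the returned timestamp are the date to consider as a target date.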

Questions Answered

Q: What if a recovered site is broken?
A: This depends on a few factors. You can fix some problems on a site manually, and missing graphics can be reconstructed. Sometimes you may find an alternative resource on archive.org, in which case you can use the Upload URL feature in the File Manager. Your main focus should be on the start page (index.html) and at least one sub page. If the site is too broken and you cannot reconstruct the start page, we recommend you delete the site and find an alternative date with fewer broken links or images.
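
To get a rough list of what needs fixing on the start page, you could scan a local copy of index.html for references to files that were not recovered. The sketch below is only an illustration and not a RetroWeb tool: it assumes you have the page's HTML as a string and a set of file names present in the site, and it treats every local reference as relative to the site root. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collect href, src, and background attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "background") and value:
                self.references.append(value)


def missing_local_references(html_text, available_files):
    """Return local references in html_text that are not in available_files."""
    collector = ReferenceCollector()
    collector.feed(html_text)
    missing = []
    for ref in collector.references:
        # Skip external URLs, mail links, and in-page anchors.
        if "://" in ref or ref.startswith(("mailto:", "#")):
            continue
        # Drop fragment and query parts, and any leading slash.
        path = ref.split("#", 1)[0].split("?", 1)[0].lstrip("/")
        if path and path not in available_files and path not in missing:
            missing.append(path)
    return missing
```

Each name in the returned list is a candidate for reconstruction, or for the Upload URL feature if archive.org has a copy from another date.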

Q: How do I know if my job has failed?
A: You can view running jobs in the control panel, and a job's logs will usually indicate a failure. If you notice that a job has started archiving with the wrong target date, you can also delete it while it is still running.

Q: In the logs, it looks like archiving has slowed down. Is it stuck?
A: Toward the end of an archival process, the archiver goes through a lot of files trying to find any files that it may have missed. This is probably what you are seeing. Give it time - it will complete.

Q: I cannot access the web site I just crawled on the development server. The job is Complete, but I'm always getting a 500 server error!
A: The development server is exact when it comes to the URL. “www.site.com” is different from “site.com”, so make sure you are accessing the site with the URL you crawled. If you crawled “www.site.com”, access the site as “www.site.com”. If you crawled “site.com”, access the site as “site.com”. Redirects are added only after publishing, at which point “site.com” will go to the primary site “www.site.com” and vice versa.
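
The exact-match rule can be illustrated with a small check. This is a sketch of the comparison as described in the answer, not the development server's actual code, and the function names are hypothetical. It compares the host you crawled with the host you are requesting and suggests the other variant to try.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Extract the lowercased host name from a URL or a bare host."""
    if "://" not in url:
        url = "http://" + url
    return (urlsplit(url).hostname or "").lower()


def dev_server_url_matches(crawled: str, requested: str) -> bool:
    """True only if both refer to exactly the same host name."""
    return host_of(crawled) == host_of(requested)


def www_variant(host: str) -> str:
    """Return the other form of a host: with or without the www. prefix."""
    return host[4:] if host.startswith("www.") else "www." + host
```

If `dev_server_url_matches` returns `False` for the address in your browser, try the address given by `www_variant` before assuming the job itself is broken.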

Q: I would like to back up the site I crawled. Is that possible?
A: Yes, you can always back up your site, even after you fix and edit it. Just log on to the Contributor Panel, go to the File Manager of your website using the Edit button, select all files, and click on the ZIP or TAR button to create an archive of the selected files. You can then download the archive to your computer.
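
After downloading, it can be worth confirming that the backup actually contains the start page. The sketch below uses only the Python standard library and assumes the downloaded file is an ordinary ZIP or TAR archive; the function names are hypothetical.

```python
import tarfile
import zipfile


def archive_members(path: str) -> list[str]:
    """List the member names of a ZIP or TAR archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    if tarfile.is_tarfile(path):
        with tarfile.open(path) as tf:
            return tf.getnames()
    raise ValueError(f"{path} is neither a ZIP nor a TAR archive")


def has_start_page(path: str, start_page: str = "index.html") -> bool:
    """True if a file named start_page exists anywhere in the archive."""
    return any(name.split("/")[-1] == start_page
               for name in archive_members(path))
```

Calling `has_start_page("site.zip")` on the downloaded file is a quick sanity check; listing `archive_members` shows everything that was included.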


Questions? Comments?

retroweb/contribs-howto.1599863602.txt.gz · Last modified: 2020-09-11 22:33 by omolini