Slate.com
Arnaud
06-23-2004, 09:02 AM
hey guys (again),
i would like to read Salon :p but i would also like to read Slate
there's no mobile edition per se, and no directly reachable PDA-friendly page, but there is a text-only page
it's linked from here : http://slate.msn.com/id/117499/
and the direct link is http://slate.msn.com/apps/myslate/action/save.aspx?ids=toc
beware, it's a VERY heavy page to load !
as you've probably already noticed (and would expect from Microsoft) :p you can't put the link directly into iSilo: it doesn't recognize the format and refuses to convert it
once again, what i can do is :
- first, load the page ; don't be surprised, it looks a bit weird (but turns back into normal html once saved)
- then save it as html
- third, drag it into iSilo and convert it
bummer
do you know a way to automate this, once again ?
a simple and efficient URL (or a suffix for this page) that i could enter and schedule in iSilo would be just fine for me :)
thanks !!!
Voltage Spike
06-23-2004, 09:41 AM
The web page reports the MIME type as "text/x-anything". If iSiloX uses this information, it would not be considered HTML. The page is, however, HTML; perhaps iSiloX can/could ignore the MIME type.
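To illustrate the MIME type point, here is a minimal Python sketch of how a converter might decide whether a response is HTML from its Content-Type header. It is not how iSiloX works internally (the thread does not say); the function names and the set of accepted types are assumptions made for the example.

```python
# Hypothetical helper names; the accepted-type set is an assumption for this sketch.
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def media_type(header_value: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return header_value.split(";", 1)[0].strip().lower()


def looks_like_html(header_value: str) -> bool:
    """True if the Content-Type header names a type this sketch treats as HTML."""
    return media_type(header_value) in HTML_TYPES
```

With this check, "text/x-anything" would be rejected even though the body is HTML, which matches the behaviour described above.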
The other alternative is converting in two stages: download the web pages using a separate tool, and then convert the (now) local web pages in iSiloX. (Note: I actually prefer to keep downloading separate from iSiloX since it allows me to grab only the parts of the web page that have changed.)
Arnaud
06-24-2004, 01:56 AM
i don't quite understand what you're advising me to do..
to put it another way, are you telling me something different from what i already proposed ? :) :)
save first, then convert ...
it looks like the same thing to me :)
anyway, i'm asking again (a bit differently) :
the page is http://slate.msn.com/toolbar.aspx?id=toc : is there a way to make iSilo understand this kind of URL ?
thanks !
(and thanks Voltage Spike :) )
Voltage Spike
06-24-2004, 09:38 AM
hehe, sorry for the confusion.
In essence, I was saying the same thing. However, my suggestion was leaning more towards the use of a third-party download utility (as opposed to manually saving the web page from within the browser).
Personally, I use a combination of wget and iSiloXC (both command-line tools) to "scoop" web sites. If you are familiar with the CLI, then it is trivial to make this a single-step process.
However...
I would like to know if iSiloX has the ability to ignore MIME types. :-)
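The download half of the two-stage approach described above can be sketched in Python as a stand-in for wget. This is not the actual wget command used in this post, and the conversion step is left out because the iSiloXC options are not given in the thread. The function name and the output file name are assumptions.

```python
import shutil
import urllib.request


def download(url: str, dest_path: str) -> str:
    """Fetch url, write the raw bytes to dest_path, and return the
    Content-Type the server reported (empty string if none)."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
        return response.headers.get("Content-Type", "")


# Example call (file name is hypothetical):
# download("http://slate.msn.com/toolbar.aspx?id=toc", "slate_toc.html")
```

The saved file is written regardless of the reported type, so the local copy can then be handed to the converter as an ordinary HTML file.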
iSilo
06-24-2004, 10:51 AM
Converting http://slate.msn.com/toolbar.aspx?id=toc with iSiloX works fine. The content type header that I'm getting has the value "text/html; charset=utf-8" and not "text/x-anything". Perhaps they changed it in the interim?
So iSiloX is able to download and convert the page. There is one issue, though: the in-page links to the articles do not match their target anchors. For example, the page has <a href="#2102798"><b>mixing desk</b></a>, but the anchor of the target of the link is <a name="#2102798">. The '#' should not be there. It should instead be <a name="2102798">. The next iSiloX update will incorporate a change to handle this scenario so that if a target anchor name begins with '#', the '#' will be ignored.
The workaround is to contact the webmaster of the site and have them correct the situation, or to download the page yourself and use a text editor to search and replace <a name="# with <a name=" (i.e., remove the '#'), then convert the modified page.
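The search-and-replace workaround can also be scripted. Below is a minimal Python sketch, assuming the page has already been saved locally; the function name is hypothetical, and the pattern is slightly more lenient than a literal text-editor replace (it tolerates upper case and extra whitespace).

```python
import re

# Matches the opening of an anchor such as <a name="# and captures everything
# before the stray '#', so the replacement can keep it unchanged.
_BAD_ANCHOR = re.compile(r'(<a\s+name\s*=\s*")#', re.IGNORECASE)


def strip_hash_from_anchor_names(html: str) -> str:
    """Turn <a name="#2102798"> into <a name="2102798">; href values are untouched."""
    return _BAD_ANCHOR.sub(r"\1", html)
```

Only the name attribute is rewritten, so links such as <a href="#2102798"> keep their '#' and now point at a matching anchor.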
Arnaud
06-25-2004, 07:03 AM
iSilo wrote: "Converting http://slate.msn.com/toolbar.aspx?id=toc with iSiloX works fine."
you're right !
i just checked once again, and it works
(it generates an 820 k file though !)
iSilo wrote: "There is one issue, though: the in-page links to the articles do not match their target anchors. For example, the page has <a href="#2102798"><b>mixing desk</b></a>, but the anchor of the target of the link is <a name="#2102798">. The '#' should not be there. It should instead be <a name="2102798">. The next iSiloX update will incorporate a change to handle this scenario so that if a target anchor name begins with '#', the '#' will be ignored.
The workaround is to contact the webmaster of the site and have them correct the situation, or to download the page yourself and use a text editor to search and replace <a name="# with <a name=" (i.e., remove the '#'), then convert the modified page."
ok, that's complete Hebrew to me :)
the thing i note is that it works fine !
thanks again
(could you explain, in my other "Salon" thread, how to set cookies in iSilo ?
i really don't understand this feature ...)