Author Topic: Programmatically download the latest 64 bit version of Firefox for Windows.

Offline AIR

  • BASIC Developer
  • Posts: 628
New Day, New Code Challenge.

Download the latest 64 bit version of Firefox for Windows.

Requirements:

  • Connect to "https://www.mozilla.org/en-US/firefox/new"
  • Parse the html to extract the current version number
  • Parse the html to extract the download link to the FULL version (there is a "stub" version, we don't want that)
  • Clear the screen
  • Show a message with the version of Firefox being downloaded.
  • Show the download progress as it's happening
  • Show a completion message.

The progress should be displayed in the following format:
Quote
    Downloading Latest Firefox (version number goes here)
    Downloaded 34140000 of 44400072 at 11784kb/s

    That is, the amount downloaded so far (updated in real time), the total size, and the internet speed, each in kb.

    The download progress should display on a single line, updating that line as the download progresses.
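
    A minimal Python sketch of the single-line progress display described above. The function names are hypothetical, and it assumes "kb" means 1000 bytes, which the post does not actually define.

```python
import sys


def format_progress(done: int, total: int, bytes_per_sec: float) -> str:
    """Build the status text in the format the challenge asks for."""
    # Assumption: "kb" is taken to mean 1000 bytes here.
    return f"Downloaded {done} of {total} at {int(bytes_per_sec // 1000)}kb/s"


def show_progress(done: int, total: int, bytes_per_sec: float, out=sys.stdout) -> None:
    """Redraw the status on the current console line."""
    # "\r" returns to column 0 and "\x1b[0K" erases to the end of the line,
    # so each update overwrites the previous one instead of scrolling.
    out.write("\r\x1b[0K" + format_progress(done, total, bytes_per_sec))
    out.flush()


show_progress(34140000, 44400072, 11784000)
print()
```

    Calling show_progress from whatever download loop or callback a language provides keeps the output on one line; the final print() just moves to a fresh line when the download is done.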


I'll be posting a submission using Nim this evening and leave the other languages to whoever wants to submit something....

AIR.

Offline John

  • Forum Support / SB Dev
  • Posts: 2622
    • ScriptBasic Open Source Project
This looks like a natural fit for SB and cURL.

Offline AIR

Nim submission; works on macOS and Linux (on a Raspberry Pi).

To compile:  nim c -d:ssl getFirefox.nim

Code: [Select]
Firefox Download Challenge (Nim Version) by AIR.

Downloading Latest 64Bit Firefox (63.0) for Windows.

Downloaded 42.343MiB of 42.343MiB

Download Complete.

Code: Nim
import httpclient, htmlparser,xmlparser, xmltree, streams, ospaths, asyncdispatch, strutils

const
    ClrLine = "\x1b[0K\r"   # erase to end of line, then return to column 0
var
    totalsize:BiggestInt    # total size last reported by the progress callback

# Clear the screen and move the cursor to the top-left corner.
proc ClrScr() =
    stdout.write("\x1b[2J\x1b[H")

# Progress callback: redraws the same console line on every update.
proc showProgress(total, progress, speed: BiggestInt) {.async.} =
    totalsize = total
    # speed arrives in bytes per second; report it in kb/s as the challenge asks
    stdout.write(ClrLine, "Downloading ", progress.formatSize, " of ", total.formatSize, " at ", speed div 1000, "kb/s")
    flushFile(stdout)

proc download(url,filename:string) {.async.} =
    var client = newAsyncHttpClient()
    client.onProgressChanged = showProgress
    await client.downloadFile(url, filename)
    # Final status line, padded so it overwrites any leftover progress text
    stdout.write(ClrLine, "Downloaded ", totalsize.formatSize, " of ", totalsize.formatSize," ".repeat(20))
    flushFile(stdout)
    echo "\n\nDownload Complete.\n"

proc main() =
    var
        client = newHttpClient()
        src = client.getContent("https://www.mozilla.org/en-US/firefox/new")
        c = parseHtml( newStringStream(src) )
        url: string
        version: string

    # The version number is an attribute on the <html> element
    for html in c.findAll("html"):
        version = html.attr("data-latest-firefox")
        break

    # The full installer link sits inside the <li class="os_win64"> entry
    for d in c.findAll("li"):
        if d.attr("class") == "os_win64":
            for e in d.findAll("a"):
                if e.attr("class") == "download-link":
                    url = e.attr("href")
                    break

    ClrScr()

    echo "Firefox Download Challenge (Nim Version) by AIR.\n"
    echo "Downloading Latest 64Bit Firefox (",version,") for Windows.\n"

    waitFor url.download("Firefox Setup " & version & ".exe")

main()


This would have been so much simpler/clearer if Nim supported xpath syntax, but there ya go....
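
For comparison, here is a rough Python sketch of the same nested lookup using the standard library's html.parser. The SAMPLE markup is a hypothetical stand-in that only mimics the attributes the Nim code looks for (data-latest-firefox, os_win64, download-link); the real page may differ, and the example.invalid URLs are placeholders.

```python
from html.parser import HTMLParser

# Hypothetical sample; the live page's markup may differ and can change.
SAMPLE = """
<html data-latest-firefox="63.0">
<body>
<ul>
  <li class="os_win"><a class="download-link" href="https://example.invalid/stub">Windows</a></li>
  <li class="os_win64"><a class="download-link" href="https://example.invalid/full-win64">Windows 64-bit</a></li>
</ul>
</body>
</html>
"""


class FirefoxPageParser(HTMLParser):
    """Collects the version attribute and the first win64 download link."""

    def __init__(self):
        super().__init__()
        self.version = None
        self.url = None
        self._in_win64 = False  # True while inside <li class="os_win64">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.version = attrs.get("data-latest-firefox")
        elif tag == "li":
            self._in_win64 = attrs.get("class") == "os_win64"
        elif tag == "a" and self._in_win64 and self.url is None:
            if attrs.get("class") == "download-link":
                self.url = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_win64 = False


parser = FirefoxPageParser()
parser.feed(SAMPLE)
print(parser.version, parser.url)
```

Tracking "am I inside the right li" with a flag is the event-driven equivalent of the nested findAll loops; it is not XPath, just another way to express the same walk.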


AIR.
« Last Edit: October 26, 2018, 06:16:12 PM by AIR »

Offline John

Here is my first pass at the challenge. I can download the file fine, but the cURL progress meter feature doesn't seem to be working for me yet. I may have to resort to the static INFO functions to build your download status string.

@AIR - When you said ignore the stub URL, did you mean the download file URL it returns or did I get the right link?
Code: Script BASIC
INCLUDE curl.bas

' Fetch the Firefox landing page
ch = curl::init()
curl::option(ch,"URL","https://www.mozilla.org/en-US/firefox/new/")
wp = curl::perform(ch)

' The second wildcard (JOKER(2)) captures the version number
IF wp LIKE "*data-latest-firefox=\"*\" data-esr-versions*" THEN PRINT JOKER(2),"\n"

' The fourth wildcard captures the href of the Windows 64-bit entry
IF wp LIKE """*<div id="other-platforms">*<li class="os_win64">*<a href="*"*""" THEN
  curl::option(ch,"URL", JOKER(4))
END IF

' That link returns a small HTML page whose href points at the installer
dl_html = curl::perform(ch)
IF dl_html LIKE """*href="*"*""" THEN dl_file = JOKER(2)

' Download the installer to a file
curl::option(ch,"URL", dl_file)
curl::option(ch,"FILE","Firefox_Win64.exe")
curl::perform(ch)

curl::finish(ch)


jrs@jrs-laptop:~/sb/abcc$ time scriba ffdl.sb
63.0

real   0m2.422s
user   0m1.108s
sys   0m0.291s

jrs@jrs-laptop:~/sb/abcc$ ls -l Firefox_Win64.exe
-rw-r--r-- 1 jrs jrs 44400072 Oct 27 01:02 Firefox_Win64.exe
jrs@jrs-laptop:~/sb/abcc$

« Last Edit: October 27, 2018, 09:43:13 AM by John »

Offline AIR


Quote from: John
@AIR - When you said ignore the stub URL, did you mean the download file URL it returns or did I get the right link?


The stub version is a small program you download that will then download and install Firefox.

What we're after is the full installer, which once you have it doesn't require an internet connection for Firefox to be installed on a system.

Quote from: John
I may have to resort to the static INFO functions to build your download status string.

I think you're gonna have to update the Curl module, because according to the old module docs CURLOPT_PROGRESSFUNCTION is not implemented:

Quote from: ScriptBasic Documentation
CURLOPT_PROGRESSFUNCTION is used in conjunction with CURLOPT_PROGRESSDATA to specify a progress function that CURLIB library calls from time to time to allow progress indication for the user. This is not implemented in the ScriptBasic interface.

Which means that you won't get real time stats while the download is happening with the Curl module as it currently exists...
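
To illustrate what a progress callback buys you, here is a small Python sketch (not libcurl, and not the SB module): a chunked copy that reports (total, done) after every chunk. The name copy_with_progress and the chunk size are made up for the example; the abort-on-truthy-return convention is borrowed from how libcurl treats a non-zero return from its progress callback.

```python
import io


def copy_with_progress(src, dst, total, on_progress, chunk_size=8192):
    """Copy src to dst in chunks, calling on_progress(total, done) each time."""
    done = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        done += len(chunk)
        # A truthy return from the callback aborts the transfer.
        if on_progress(total, done):
            return False
    return True


# In-memory demo: record every update instead of printing it.
updates = []
payload = b"x" * 20000
ok = copy_with_progress(io.BytesIO(payload), io.BytesIO(), len(payload),
                        lambda total, done: updates.append((total, done)))
print(ok, updates)
```

Without such a hook, the caller only regains control once the whole transfer has finished, which is why only after-the-fact stats are possible.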


AIR.
« Last Edit: October 27, 2018, 10:34:12 AM by AIR »

Offline John

Based on your feedback, I have the right installer file, and I'm stuck with after-the-fact download stats unless I enhance the libcurl extension module.

On a positive note, a 2.4 second delay before completion and results isn't an eternity.

progressfunc.c
Code: C
/***************************************************************************
 *                                  _   _ ____  _
 *  Project                     ___| | | |  _ \| |
 *                             / __| | | | |_) | |
 *                            | (__| |_| |  _ <| |___
 *                             \___|\___/|_| \_\_____|
 *
 * Copyright (C) 1998 - 2018, Daniel Stenberg, <daniel@haxx.se>, et al.
 *
 * This software is licensed as described in the file COPYING, which
 * you should have received as part of this distribution. The terms
 * are also available at https://curl.haxx.se/docs/copyright.html.
 *
 * You may opt to use, copy, modify, merge, publish, distribute and/or sell
 * copies of the Software, and permit persons to whom the Software is
 * furnished to do so, under the terms of the COPYING file.
 *
 * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY
 * KIND, either express or implied.
 *
 ***************************************************************************/
/* <DESC>
 * Use the progress callbacks, old and/or new one depending on available
 * libcurl version.
 * </DESC>
 */
#include <stdio.h>
#include <curl/curl.h>

#if LIBCURL_VERSION_NUM >= 0x073d00
/* In libcurl 7.61.0, support was added for extracting the time in plain
   microseconds. Older libcurl versions are stuck in using 'double' for this
   information so we complicate this example a bit by supporting either
   approach. */
#define TIME_IN_US 1
#define TIMETYPE curl_off_t
#define TIMEOPT CURLINFO_TOTAL_TIME_T
#define MINIMAL_PROGRESS_FUNCTIONALITY_INTERVAL     3000000
#else
#define TIMETYPE double
#define TIMEOPT CURLINFO_TOTAL_TIME
#define MINIMAL_PROGRESS_FUNCTIONALITY_INTERVAL     3
#endif

#define STOP_DOWNLOAD_AFTER_THIS_MANY_BYTES         6000

struct myprogress {
  TIMETYPE lastruntime; /* type depends on version, see above */
  CURL *curl;
};

/* this is how the CURLOPT_XFERINFOFUNCTION callback works */
static int xferinfo(void *p,
                    curl_off_t dltotal, curl_off_t dlnow,
                    curl_off_t ultotal, curl_off_t ulnow)
{
  struct myprogress *myp = (struct myprogress *)p;
  CURL *curl = myp->curl;
  TIMETYPE curtime = 0;

  curl_easy_getinfo(curl, TIMEOPT, &curtime);

  /* under certain circumstances it may be desirable for certain functionality
     to only run every N seconds, in order to do this the transaction time can
     be used */
  if((curtime - myp->lastruntime) >= MINIMAL_PROGRESS_FUNCTIONALITY_INTERVAL) {
    myp->lastruntime = curtime;
#ifdef TIME_IN_US
    fprintf(stderr, "TOTAL TIME: %" CURL_FORMAT_CURL_OFF_T ".%06ld\r\n",
            (curtime / 1000000), (long)(curtime % 1000000));
#else
    fprintf(stderr, "TOTAL TIME: %f \r\n", curtime);
#endif
  }

  fprintf(stderr, "UP: %" CURL_FORMAT_CURL_OFF_T " of %" CURL_FORMAT_CURL_OFF_T
          "  DOWN: %" CURL_FORMAT_CURL_OFF_T " of %" CURL_FORMAT_CURL_OFF_T
          "\r\n",
          ulnow, ultotal, dlnow, dltotal);

  if(dlnow > STOP_DOWNLOAD_AFTER_THIS_MANY_BYTES)
    return 1;
  return 0;
}

#if LIBCURL_VERSION_NUM < 0x072000
/* for libcurl older than 7.32.0 (CURLOPT_PROGRESSFUNCTION) */
static int older_progress(void *p,
                          double dltotal, double dlnow,
                          double ultotal, double ulnow)
{
  return xferinfo(p,
                  (curl_off_t)dltotal,
                  (curl_off_t)dlnow,
                  (curl_off_t)ultotal,
                  (curl_off_t)ulnow);
}
#endif

int main(void)
{
  CURL *curl;
  CURLcode res = CURLE_OK;
  struct myprogress prog;

  curl = curl_easy_init();
  if(curl) {
    prog.lastruntime = 0;
    prog.curl = curl;

    curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/");

#if LIBCURL_VERSION_NUM >= 0x072000
    /* xferinfo was introduced in 7.32.0, no earlier libcurl versions will
       compile as they won't have the symbols around.

       If built with a newer libcurl, but running with an older libcurl:
       curl_easy_setopt() will fail in run-time trying to set the new
       callback, making the older callback get used.

       New libcurls will prefer the new callback and instead use that one even
       if both callbacks are set. */

    curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, xferinfo);
    /* pass the struct pointer into the xferinfo function, note that this is
       an alias to CURLOPT_PROGRESSDATA */
    curl_easy_setopt(curl, CURLOPT_XFERINFODATA, &prog);
#else
    curl_easy_setopt(curl, CURLOPT_PROGRESSFUNCTION, older_progress);
    /* pass the struct pointer into the progress function */
    curl_easy_setopt(curl, CURLOPT_PROGRESSDATA, &prog);
#endif

    curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L);
    res = curl_easy_perform(curl);

    if(res != CURLE_OK)
      fprintf(stderr, "%s\n", curl_easy_strerror(res));

    /* always cleanup */
    curl_easy_cleanup(curl);
  }
  return (int)res;
}
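
The hex constants in the #if checks above pack the libcurl version as 0xXXYYZZ (major, minor, patch). A tiny Python helper, with a made-up name, shows how to read them:

```python
def decode_libcurl_version(num: int) -> tuple:
    """Split a LIBCURL_VERSION_NUM value (0xXXYYZZ) into (major, minor, patch)."""
    return (num >> 16) & 0xFF, (num >> 8) & 0xFF, num & 0xFF


print(decode_libcurl_version(0x073D00))  # (7, 61, 0)
print(decode_libcurl_version(0x072000))  # (7, 32, 0)
```

So 0x073d00 is 7.61.0 and 0x072000 is 7.32.0, matching the comments in the example.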
« Last Edit: October 27, 2018, 12:16:41 PM by John »

Offline John

Script BASIC

Code: Script BASIC
' Firefox Download Challenge (Script BASIC Version) by JRS.

INCLUDE curl.bas

' Fetch the Firefox landing page
ch = curl::init()
curl::option(ch,"URL","https://www.mozilla.org/en-US/firefox/new/")
wp = curl::perform(ch)

' The second wildcard (JOKER(2)) captures the version number
IF wp LIKE "*data-latest-firefox=\"*\" data-esr-versions*" THEN
  version = JOKER(2)
  PRINT "Downloading Latest 64Bit Firefox (",version,") for Windows.\n"
END IF

' The fourth wildcard captures the href of the Windows 64-bit entry
IF wp LIKE """*<div id="other-platforms">*<li class="os_win64">*<a href="*"*""" THEN
  curl::option(ch,"URL", JOKER(4))
END IF

' That link returns a small HTML page whose href points at the installer
dl_html = curl::perform(ch)
IF dl_html LIKE """*href="*"*""" THEN dl_file = JOKER(2)

' Download the installer; NOPROGRESS=0 turns on libcurl's built-in progress meter
curl::option(ch,"URL", dl_file)
curl::option(ch,"FILE","Firefox_Setup-" & version & ".exe")
curl::option(ch,"NOPROGRESS",0)
curl::perform(ch)

' Report size and average speed after the transfer has finished
PRINTNL
PRINT "Firefox_Setup-" & version & ".exe Downloaded ",FORMAT("%~##,###,###~ Bytes",curl::info(ch,"SIZE_DOWNLOAD")), _
  " at ",FORMAT("%~##,###,###~ Bytes/Second",curl::info(ch,"SPEED_DOWNLOAD")),".\n"

curl::finish(ch)


jrs@jrs-laptop:~/sb/abcc$ time scriba ffdl.sb
Downloading Latest 64Bit Firefox (63.0) for Windows.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 42.3M  100 42.3M    0     0  24.9M      0  0:00:01  0:00:01 --:--:-- 27.7M

Firefox_Setup-63.0.exe Downloaded 44,400,072 Bytes at 26,133,061 Bytes/Second.

real   0m2.863s
user   0m1.213s
sys   0m0.276s
jrs@jrs-laptop:~/sb/abcc$
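
As a point of comparison, the after-the-fact summary line can be sketched in Python like this. The function name is hypothetical; the figures are the ones from the run above.

```python
def summary(filename: str, size_bytes: int, bytes_per_sec: float) -> str:
    """Format the post-download summary with thousands separators."""
    return (f"{filename} Downloaded {size_bytes:,} Bytes"
            f" at {int(bytes_per_sec):,} Bytes/Second.")


line = summary("Firefox_Setup-63.0.exe", 44400072, 26133061)
print(line)
```

This only needs the totals that are available once the transfer is done, so it works even without a progress callback.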

« Last Edit: October 28, 2018, 12:40:41 PM by John »

Offline John

Curious how big that Nim executable is with all those dependency requirements?

Code: [Select]
import httpclient, htmlparser,xmlparser, xmltree, streams, ospaths, asyncdispatch, strutils
« Last Edit: October 27, 2018, 02:08:09 PM by John »

Offline AIR

That's because the core Nim compiler is designed to be fairly lean.

Similar to SB/Python/Ruby/FreePascal/Delphi/Xojo(RealBasic)/PowerShell/Lua/ZShell/C#, it has the concept of a "Modules" system whereby additional functionality can be added to a given programming language without having it all baked in, or having to recompile the core to add that functionality.

Not having all functionality baked in also allows one to alter a given module in order to achieve a desired result.  SB's Curl module is a perfect example, where it doesn't currently support Curl Callbacks, but that feature can be added without messing with any of the core stuff.

AIR.

Offline AIR

I was curious how difficult this would be in PowerShell, so here's my implementation:

Code: PowerShell
# BitsTransfer provides Start-BitsTransfer, which draws its own progress bar
Import-Module BitsTransfer

Clear-Host

echo "Firefox Download Challenge (Powershell Version) by AIR.`r`n"

# Hide Invoke-WebRequest's own progress display while fetching the page
$progressPreference = 'silentlyContinue'
$result = Invoke-WebRequest "https://www.mozilla.org/en-US/firefox/new/"
$progressPreference = 'Continue'

# Pull the version number out of the data-latest-firefox attribute
$FF_Version = ($result | %{ $_ -match 'data-latest-firefox="(.+?)"'}) | %{$Matches[1]}

echo "Downloading Latest 64Bit Firefox ($FF_Version) for Windows.`r`n"

# Find the <li> for Windows 64-bit and take the first quoted string (the href)
$HTML = $result.ParsedHtml.getElementsByTagName('li')

$hits = $HTML | where {$_.outerHTML -match "class=os_win64"}

$download_link = ($hits | where {$_.outerTEXT -match 'Windows 64-bit'}).innerHTML | %{[regex]::matches($_,'(?<=\").+?(?=\")')[0].value}

Start-BitsTransfer -Source $download_link -Destination "Firefox Setup $FF_VERSION.exe"

echo "Download Complete.`r`n"

Note that the "BitsTransfer" module provides a console-based progress bar, so technically this doesn't meet the full requirements of the challenge.  But it was fun to code!

AIR.

Offline John

« Reply #10 on: October 27, 2018, 05:43:26 PM »
Are you able to provide execution times for the submissions? Seeing the output would also be nice.

Offline AIR

« Reply #11 on: October 27, 2018, 06:52:03 PM »
Execution times in this case would be very dependent on your Internet speeds.

For example, my speed is rated at 960+ Mb/s (roughly 120 MB/s), so the download happens in about half a second for me.  That's a wired connection; WiFi would always be slower.
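
The arithmetic behind those numbers, as a quick Python sketch. The file size is the 44,400,072 bytes reported earlier in the thread; the result is a best case that ignores protocol overhead and server limits.

```python
def megabits_to_megabytes(mbps: float) -> float:
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return mbps / 8


FILE_SIZE = 44_400_072  # bytes, as reported by ls -l earlier in the thread
link_MBps = megabits_to_megabytes(960)
seconds = FILE_SIZE / (link_MBps * 1_000_000)
print(f"{link_MBps:.0f} MB/s, best case {seconds:.2f}s")
```

That comes out a little under the "about half a second" observed, which is plausible once connection setup and the page fetches are included.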

Also, it would be dependent on how your connection to the website is routed.  I might go through fast routers, and you might go through one that is having packet issues resulting in retries/performance loss.

Besides, this challenge isn't about how FAST you can do it, but HOW you would do it....that's going to be the case with any challenges I throw out there... 8)

AIR.


Offline John

« Reply #12 on: October 27, 2018, 06:59:36 PM »
Maybe post a poll at the end of each challenge so those lurking can vote on what they think is the best solution.

Offline AIR

« Reply #13 on: October 27, 2018, 09:24:34 PM »
Firefox Download Challenge (NIM Version) by AIR


Updated code: removed the XML-based parsing and switched to regex instead...

Code: Nim
import httpclient,re,asyncdispatch,strutils

const
    ClrLine = "\x1b[0K\r"   # erase to end of line, then return to column 0
var
    totalsize:BiggestInt    # total size last reported by the progress callback

# Clear the screen and move the cursor to the top-left corner.
proc ClrScr() =
    stdout.write("\x1b[2J\x1b[H")

# Progress callback: redraws the same console line on every update.
proc showProgress(total, progress, speed: BiggestInt) {.async.} =
    totalsize = total
    # speed arrives in bytes per second; report it in kb/s as the challenge asks
    stdout.write(ClrLine, "Downloading ", progress.formatSize, " of ", total.formatSize, " at ", speed div 1000, "kb/s")
    flushFile(stdout)

proc download(url,filename:string) {.async.} =
    var client = newAsyncHttpClient()
    client.onProgressChanged = showProgress
    await client.downloadFile(url, filename)
    # Final status line, padded so it overwrites any leftover progress text
    stdout.write(ClrLine, "Downloaded ", totalsize.formatSize, " of ", totalsize.formatSize," ".repeat(20))
    flushFile(stdout)
    echo "\n\nDownload Complete.\n"

proc main() =
    var
        client = newHttpClient()
        src = client.getContent("https://www.mozilla.org/en-US/firefox/new")
        url,version: string
        x:int
        matches: array[2,string]

    # Capture digits and dots so a version like 63.0.1 is not truncated;
    # the lone dots stand in for the quotes around the attribute value
    x = src.find(re"data-latest-firefox=.([\d.]+).",matches)
    version = matches[0]

    # Capture the full-installer link for 64-bit Windows, en-US
    x = src.find(re"(http.+product=firefox-latest-ssl.+os=win64.+en-US)",matches)
    url = matches[0]

    ClrScr()
    echo "Firefox Download Challenge (Nim Version) by AIR.\n"
    echo "Downloading Latest 64Bit Firefox (",version,") for Windows.\n"

    waitFor url.download("Firefox Setup " & version & ".exe")


main()
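
For anyone who wants to try the regex approach in another language, here is a rough Python equivalent run against a hypothetical snippet. SAMPLE is invented for the example (the example.invalid host is a placeholder) and only mirrors the pieces the Nim patterns look for; the live page may be laid out differently.

```python
import re

# Hypothetical snippet; the live page's markup may differ.
SAMPLE = (
    '<html data-latest-firefox="63.0.1" data-esr-versions="60.3.0">'
    '<a href="https://example.invalid/?product=firefox-stub&amp;os=win&amp;lang=en-US">stub</a>'
    '<a href="https://example.invalid/?product=firefox-latest-ssl&amp;os=win64&amp;lang=en-US">full</a>'
)

# Version: digits and dots inside the attribute's quotes.
version_match = re.search(r'data-latest-firefox="([\d.]+)"', SAMPLE)

# URL: stay inside a single href value ([^"]*) so the match cannot
# run across neighbouring links.
url_match = re.search(
    r'href="([^"]*product=firefox-latest-ssl[^"]*os=win64[^"]*)"', SAMPLE)

version = version_match.group(1) if version_match else None
# HTML attributes encode "&" as "&amp;"; undo that before using the URL.
url = url_match.group(1).replace("&amp;", "&") if url_match else None
print(version, url)
```

Bounding the match with [^"]* instead of .+ is a design choice: it keeps the pattern from swallowing everything between the first "http" and the last "en-US" on a long line.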

C'mon, there's gotta be SOMEONE who would like to give this a try in their language of choice!!  (Tomaaz, I know you're out there!!!  Post it on RetroB if you want, I'm interested in seeing how you do this!)

AIR.

Offline John

« Reply #14 on: October 28, 2018, 12:34:41 AM »
I bet a code challenge to make REGEX emulate LIKE would be fun. I have avoided regex like the plague. It's time I learn this commonly used function. (so f*cking cryptic)
« Last Edit: October 28, 2018, 12:53:42 AM by John »
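
As a starting point for that idea, here is a minimal Python sketch that makes a regex behave like a LIKE pattern with "*" wildcards. It is a simplification, not Script BASIC's actual LIKE: only "*" is handled, each wildcard is assumed to match as little as possible, and the function name like is made up. The returned list plays the role of JOKER(n), counting from 1.

```python
import re


def like(text: str, pattern: str):
    """Return the wildcard captures if pattern matches text, else None."""
    # Escape the literal pieces so regex metacharacters in them are inert,
    # then join them with a non-greedy capture group per "*".
    pieces = [re.escape(part) for part in pattern.split("*")]
    regex = "(.*?)".join(pieces)
    m = re.fullmatch(regex, text, flags=re.DOTALL)
    return list(m.groups()) if m else None


jokers = like('<html data-latest-firefox="63.0" data-esr-versions="60.3.0">',
              '*data-latest-firefox="*" data-esr-versions*')
print(jokers[1])  # second wildcard, i.e. what JOKER(2) would hold
```

Seen this way, each "*" is just "(.*?)" and everything else is literal text, which may make the regex versions in this thread look a little less cryptic.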