Author Topic: LIKE +  (Read 1008 times)

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #30 on: November 05, 2018, 11:35:33 AM »
I think I should be effective with regex before trying another cryptic tool like 8th.

Offline AIR

  • RETIRED
  • BASIC Developer
  • Posts: 284
Re: LIKE +
« Reply #31 on: November 05, 2018, 01:35:40 PM »
Well, it's not so much about whether the language can do it, but about how you would implement it.

Python makes it simple too....

AIR.

Offline AIR

  • RETIRED
  • BASIC Developer
  • Posts: 284
Re: LIKE +
« Reply #32 on: November 05, 2018, 01:49:49 PM »
Quote
I think I should be effective with regex before trying another cryptic tool like 8th.

Regex is one of those things that will drive you crazy until you have an "AHA!" moment and it makes sense.

Perl is a good way to practice; PCRE was created with that syntax in mind... If I recall, the SB "RE" module is POSIX-based, meaning it uses the POSIX syntax, which is an entirely different way of doing stuff.
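To make the syntax difference concrete, here is a small Python sketch (Python's re module follows the Perl-style syntax). The sample string is made up for illustration. The first pattern relies on a lazy quantifier, which is a Perl-style extension; the second gets the same result with a negated bracket expression, a construct that POSIX syntax also understands. This says nothing about how the SB "RE" module itself behaves.

```python
import re

# Hypothetical sample input, not taken from the challenge page.
html = "<head><title>LIKE Code Challenge</title></head>"

# Perl-style: the lazy quantifier ".*?" stops at the first "</title>".
perl_style = re.search(r"<title>(.*?)</title>", html, re.DOTALL)

# Portable alternative: a negated bracket expression avoids the
# need for a lazy quantifier altogether.
portable = re.search(r"<title>([^<]*)</title>", html)

print(perl_style.group(1))
print(portable.group(1))
```

The bracket form only works here because the title text contains no "<" character, so it is a narrower tool than the lazy match.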

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #33 on: November 05, 2018, 01:51:33 PM »
Keep in mind that the split BY can consist of more than one character.
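For comparison, this is what a multi-character delimiter looks like in Python, where str.split treats the whole delimiter string as one separator. It is only an analogy for the SPLITA ... BY behavior described above; the sample strings are made up.

```python
# Splitting on a multi-character delimiter: the whole delimiter string
# is treated as one separator, not as a set of individual characters.
record = "alpha::beta::gamma"
fields = record.split("::")
print(fields)

# A lone ':' inside a field is left alone, because only "::" separates.
mixed = "a:b::c".split("::")
print(mixed)
```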

Quote
If I recall, the SB "RE" module is POSIX-based, meaning it uses the POSIX syntax, which is an entirely different way of doing stuff.

It now makes sense why none of my testing of the RE module worked.
« Last Edit: November 05, 2018, 01:59:50 PM by John »

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #34 on: November 05, 2018, 09:14:00 PM »
Another feature I'm planning to add to the Script BASIC EXTRACT: if the user passes a negative value in the optional third argument, EXTRACT will act LIKE a function and return the MATCH (just the value MATCH[-x, 0]) for that placeholder position rather than the default occurrence count. If no optional argument is passed, EXTRACT emulates SB's native LIKE, returning TRUE / FALSE.

Curious whether BaCon can return various types (numeric or string)?

« Last Edit: November 05, 2018, 11:40:07 PM by John »

Offline AIR

  • RETIRED
  • BASIC Developer
  • Posts: 284
Re: LIKE +
« Reply #35 on: November 05, 2018, 09:58:26 PM »
I don't think you can do that in BaCon with an array/associative array, but you should be able to do it using an array of RECORDs (struct, in C parlance).  But the number of entries per RECORD is fixed and typed, so if you set up a string as a field, you can't directly place a number in there.  You may be able to just set up longs and pass the address of the variables, but then you get into some C-pointer-like territory...

SB finagles this by essentially treating everything as a string, so you're able to do this in your arrays....

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #36 on: November 05, 2018, 11:25:08 PM »
SB doesn't allow returning arrays. You have to pass them back and forth as byref (default) arguments. SB variables are in the form of a variant.

The MATCH array is created in the EXTRACT function but it's a global variable not a local array.

« Last Edit: November 06, 2018, 09:19:56 AM by John »

Offline AIR

  • RETIRED
  • BASIC Developer
  • Posts: 284
Re: LIKE +
« Reply #37 on: November 09, 2018, 10:06:06 PM »
Hey John, did you ever finish the SB version?

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #38 on: November 10, 2018, 07:53:00 AM »
Close but got sidetracked playing in the sandbox. I hope to have some time this weekend to finish it.
 

Offline AIR

  • RETIRED
  • BASIC Developer
  • Posts: 284
Re: LIKE +
« Reply #39 on: November 10, 2018, 11:43:49 AM »
I figured as much.  I'm really interested in what you come up with, especially the lower level module bits. 


Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #40 on: November 10, 2018, 08:18:26 PM »
This is a very early release that captures the MATCH items between the patterns but not the other data surrounding them (the non-relevant jokers).

Code: Script BASIC
' EXTRACT (LIKE +) Code Challenge - Script BASIC / JRS

html = """<!DOCTYPE html>
<html>
 <head>
   <title>AllBASIC.INFO Forum LIKE Code Challenge</title>
 </head>
 <body>
LIKE it or don't.
 </body>
</html>"""

FUNCTION EXTRACT(base, mask)
  ' MATCH is global: [n, 0] = text, [n, 1] = start, [n, 2] = length
  UNDEF MATCH
  start = 1
  match_idx = 2
  SPLITA mask BY "*" TO patterns
  ' Take the patterns in pairs and capture the text between each pair
  FOR idx = 0 TO UBOUND(patterns) STEP 2
    ' Position of the opening pattern
    MATCH[match_idx, 1] = INSTR(base, patterns[idx], start)
    start = MATCH[match_idx, 1]
    ' Position of the closing pattern
    MATCH[match_idx, 2] = INSTR(base, patterns[idx + 1], start)
    start = MATCH[match_idx, 2]
    ' Captured text starts right after the opening pattern
    MATCH[match_idx, 1] = MATCH[match_idx, 1] + LEN(patterns[idx])
    ' Length runs up to the start of the closing pattern
    MATCH[match_idx, 2] = MATCH[match_idx, 2] - MATCH[match_idx, 1]
    MATCH[match_idx, 0] = MID(base, MATCH[match_idx, 1], MATCH[match_idx, 2])
    match_idx += 2
  NEXT
END FUNCTION

EXTRACT html, "*<title>*</title>*<body>*</body>*"

PRINT "Title:  ", MATCH[2, 0],"\n"
PRINT "Start:  ", MATCH[2, 1],"\n"
PRINT "Length: ", MATCH[2, 2],"\n"
PRINTNL
PRINT "Body:   ", MATCH[4, 0],"\n"
PRINT "Start:  ", MATCH[4, 1],"\n"
PRINT "Length: ", MATCH[4, 2],"\n"


$ time scriba extract.sb
Title:  AllBASIC.INFO Forum LIKE Code Challenge
Start:  44
Length: 39

Body:   
LIKE it or don't.
 
Start:  110
Length: 21

real   0m0.007s
user   0m0.008s
sys   0m0.000s
$
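For anyone following along in another language, the same idea can be sketched in Python. This is a hypothetical rendering, not a port of the Script BASIC code: the function name, the tuple layout, and the sample page are illustrative, it assumes the mask reduces to just its literal pieces once the empty strings from leading and trailing jokers are dropped, and it searches for the closing literal after the opening one rather than from its start.

```python
def extract(base, mask):
    """Return a list of (text, start, length) for each captured joker.

    Literals are taken in pairs and the text between each pair is
    captured. Start positions are 1-based, in the spirit of INSTR/MID.
    Returns None as soon as a literal cannot be found.
    """
    # Drop empty pieces produced by leading/trailing '*' in the mask.
    patterns = [p for p in mask.split("*") if p]
    matches = []
    pos = 0
    for i in range(0, len(patterns) - 1, 2):
        opener, closer = patterns[i], patterns[i + 1]
        begin = base.find(opener, pos)
        if begin < 0:
            return None
        begin += len(opener)            # captured text starts after the opener
        end = base.find(closer, begin)
        if end < 0:
            return None
        matches.append((base[begin:end], begin + 1, end - begin))
        pos = end                       # continue searching from the closer
    return matches


# Hypothetical sample page, shorter than the one in the post.
page = "<html><head><title>Demo</title></head><body>Hello</body></html>"
for text, start, length in extract(page, "*<title>*</title>*<body>*</body>*"):
    print(text, start, length)
```

Returning the captures as a list avoids the global MATCH array, which is a design choice made for this sketch only.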


BTW: Firefox for Windows is now at 63.0.1.
« Last Edit: November 10, 2018, 09:44:41 PM by John »

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #41 on: November 11, 2018, 08:47:19 AM »
The EXTRACT function seems to work fine with the Firefox download challenge, replacing LIKE. One issue I have discovered with this version of EXTRACT is that a pattern needs to be more than one character.

Code: Script BASIC
' Firefox Download Challenge (Script BASIC Version) by JRS.

INCLUDE curl.bas

FUNCTION EXTRACT(base, mask)
  ' MATCH is global: [n, 0] = text, [n, 1] = start, [n, 2] = length
  UNDEF MATCH
  start = 1
  match_idx = 2
  SPLITA mask BY "*" TO patterns
  ' Take the patterns in pairs and capture the text between each pair
  FOR idx = 0 TO UBOUND(patterns) STEP 2
    MATCH[match_idx, 1] = INSTR(base, patterns[idx], start)
    start = MATCH[match_idx, 1]
    MATCH[match_idx, 2] = INSTR(base, patterns[idx + 1], start)
    start = MATCH[match_idx, 2]
    MATCH[match_idx, 1] = MATCH[match_idx, 1] + LEN(patterns[idx])
    MATCH[match_idx, 2] = MATCH[match_idx, 2] - MATCH[match_idx, 1]
    MATCH[match_idx, 0] = MID(base, MATCH[match_idx, 1], MATCH[match_idx, 2])
    match_idx += 2
  NEXT
  ' Number of pattern pairs; parentheses are needed because / binds tighter than +
  EXTRACT = (UBOUND(patterns) + 1) / 2
END FUNCTION


ch = curl::init()
curl::option(ch,"URL","https://www.mozilla.org/en-US/firefox/new/")
wp = curl::perform(ch)

' MATCH[2, 0] receives the version string, MATCH[6, 0] the download URL
IF EXTRACT(wp, """*data-latest-firefox="*" data-esr-versions*<div id="other-platforms">*<li class="os_win64">*<a href="*"
           class*""") THEN
  PRINT "Downloading Latest 64Bit Firefox (",MATCH[2, 0],") for Windows.\n"
  curl::option(ch,"FOLLOWLOCATION", 1)
  curl::option(ch,"NOPROGRESS",0)
  curl::option(ch,"FILE","Firefox_Setup-" & MATCH[2, 0] & ".exe")
  curl::option(ch,"URL", MATCH[6, 0])
  curl::perform(ch)
  PRINTNL
  PRINT "Firefox_Setup-" & MATCH[2, 0] & ".exe Downloaded ",FORMAT("%~##,###,###~ Bytes",curl::info(ch,"SIZE_DOWNLOAD")), _
  " at ",FORMAT("%~##,###,###~ Bytes/Second",curl::info(ch,"SPEED_DOWNLOAD")),".\n"
ELSE
  PRINT "<< ERROR >>\n"
END IF

curl::finish(ch)


$ time scriba ff_extract.sb
Downloading Latest 64Bit Firefox (63.0.1) for Windows.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   134  100   134    0     0    611      0 --:--:-- --:--:-- --:--:--   611
100 42.3M  100 42.3M    0     0  22.1M      0  0:00:01  0:00:01 --:--:-- 34.3M

Firefox_Setup-63.0.1.exe Downloaded 44,396,144 Bytes at 23,219,740 Bytes/Second.

real   0m2.201s
user   0m1.075s
sys   0m0.214s
$

« Last Edit: November 11, 2018, 12:49:29 PM by John »

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #42 on: November 12, 2018, 05:07:15 PM »
I will be posting another update to EXTRACT with better error handling and an optional argument to specify which MATCH item you want returned by EXTRACT rather than the total MATCH count. With multiple occurrences, working out which MATCH index to select for a given occurrence gets unmanageable.


Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #43 on: November 12, 2018, 06:08:37 PM »
Script BASIC 

Note: Only JOKERS surrounded by match patterns are returned. Others are ignored. If you need them, use LIKE. I have REM'ed the UNDEF MATCH in this example to show both options EXTRACT offers. I tested the EXTRACT function with the Firefox download page and it returned the correct number of JOKERS and returned the data requested using the optional argument.

Code: Script BASIC
' EXTRACT (LIKE +) Code Challenge - Script BASIC / JRS

html = """<!DOCTYPE html>
<html>
 <head>
   <title>AllBASIC.INFO Forum LIKE Code Challenge</title>
 </head>
 <body>
LIKE it or don't.
 </body>
</html>"""

FUNCTION EXTRACT(base, mask, select)
  ' MATCH is global: [n, 0] = text, [n, 1] = start, [n, 2] = length
  UNDEF MATCH
  start = 1
  match_idx = 2
  SPLITA mask BY "*" TO patterns
  FOR idx = 0 TO UBOUND(patterns) STEP 2
    ' Opening pattern: bail out if it is not found
    MATCH[match_idx, 1] = INSTR(base, patterns[idx], start)
    IF MATCH[match_idx, 1] <> undef THEN
      start = MATCH[match_idx, 1]
    ELSE
      GOTO MATCH_ERROR
    END IF
    ' Closing pattern: bail out if it is not found
    MATCH[match_idx, 2] = INSTR(base, patterns[idx + 1], start)
    IF MATCH[match_idx, 2] <> undef THEN
      start = MATCH[match_idx, 2]
    ELSE
      GOTO MATCH_ERROR
    END IF
    MATCH[match_idx, 1] = MATCH[match_idx, 1] + LEN(patterns[idx])
    MATCH[match_idx, 2] = MATCH[match_idx, 2] - MATCH[match_idx, 1]
    MATCH[match_idx, 0] = MID(base, MATCH[match_idx, 1], MATCH[match_idx, 2])
    match_idx += 2
  NEXT
  ' With a valid select, return that MATCH item; otherwise return a count
  IF select <> undef AND select <= UBOUND(patterns) + 1 THEN
    EXTRACT = MATCH[select, 0]
    ' UNDEF MATCH
  ELSE
    EXTRACT = UBOUND(patterns) + 2
  END IF
  GOTO DONE

MATCH_ERROR:
  EXTRACT = 0

DONE:
END FUNCTION

PRINT EXTRACT(html, "*<title>*</title>*<body>*</body>*", 2), "\n"
PRINTNL
PRINT "Title:  ", MATCH[2, 0],"\n"
PRINT "Start:  ", MATCH[2, 1],"\n"
PRINT "Length: ", MATCH[2, 2],"\n"
PRINTNL
PRINT "Body:   ", MATCH[4, 0],"\n"
PRINT "Start:  ", MATCH[4, 1],"\n"
PRINT "Length: ", MATCH[4, 2],"\n"


$ time scriba extract.sb
AllBASIC.INFO Forum LIKE Code Challenge

Title:  AllBASIC.INFO Forum LIKE Code Challenge
Start:  44
Length: 39

Body:   
LIKE it or don't.
 
Start:  110
Length: 21

real   0m0.009s
user   0m0.005s
sys   0m0.004s
$
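The selection and failure handling can be illustrated with a short Python sketch. Everything here is hypothetical: the function returns a (result, captures) pair instead of filling a global array, captures are keyed 2, 4, 6, ... only to echo the MATCH indices above, and when no selection is given it returns the number of captures rather than the joker count the Script BASIC version reports.

```python
def extract(base, mask, select=None):
    """Capture the text between successive pairs of literals in `mask`.

    Returns (result, captures). `result` is the capture stored under
    `select` when that key exists, otherwise the number of captures.
    A literal that cannot be found yields (0, {}).
    """
    literals = [p for p in mask.split("*") if p]
    captures = {}
    pos = 0
    key = 2
    # Pair up literals: (1st, 2nd), (3rd, 4th), ...
    for opener, closer in zip(literals[0::2], literals[1::2]):
        begin = base.find(opener, pos)
        if begin < 0:
            return 0, {}
        begin += len(opener)
        end = base.find(closer, begin)
        if end < 0:
            return 0, {}
        captures[key] = base[begin:end]
        pos = end
        key += 2
    result = captures[select] if select in captures else len(captures)
    return result, captures


# Hypothetical sample page and mask.
page = "<title>Demo</title><body>Hello</body>"
mask = "*<title>*</title>*<body>*</body>*"
print(extract(page, mask, 2)[0])         # selected capture
print(extract(page, mask)[0])            # number of captures
print(extract(page, "*<h1>*</h1>*")[0])  # failed match
```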
« Last Edit: November 12, 2018, 11:03:55 PM by John »

Offline John

  • Forum Support / SB Dev
  • Posts: 2028
    • ScriptBasic Open Source Project
Re: LIKE +
« Reply #44 on: November 13, 2018, 02:59:58 PM »
Is there anyone besides me willing to post a working EXTRACT / MATCH submission for this code challenge?