hbag
Feb 13, 2021

leper khan posted:

So use wc to determine how many lines are in it, subtract n, then use head to grab the first len-n lines

weird, i tried that earlier and it didnt work, but now it does

CopperHound
Feb 14, 2012

I'm trying to get familiar with SQL and django and ran into one problem with this schema that is driving me nuts:


I want to be able to Select subscriptions by their next bill date which is calculated by:
last billed + interval + pause duration for all pauses that begin before the next bill

The first wrong solution that comes to mind is to make a view like this:
code:
SELECT
	s.*,
	ADDTIME(s.last_billed + INTERVAL interval_months MONTH + INTERVAL interval_weeks WEEK,SUM(TIMEDIFF(p.end,p.begin))) AS next_bill
FROM subscription AS s
LEFT JOIN pause p
	ON s.id = p.subscription_id
WHERE p.begin >= s.last_billed AND p.begin < (s.last_billed + INTERVAL interval_months MONTH + INTERVAL interval_weeks WEEK)
GROUP BY s.id
But this leaves out pauses that should be included because of the new extended bill date.

In my django app I COULD iterate through the pauses for each subscription, but that would mean making a separate remote database call for EVERY row in the subscription table.

Is there something obvious I'm missing here, or should I have a denormalized next_bill column that gets updated by something like a trigger?

hbag
Feb 13, 2021

alright, i got another issue
im using sed to isolate the quotes for my script, but when i do it's putting the whole quote on one line, instead of separate lines like it is before i isolate them with sed
for example:

Before isolating the quotes:


After isolating the quotes:


Here's the code I'm using to isolate the quotes, I'd appreciate any advice with how I can get it to include the newline characters.

code:
QUOTE1=$(sed -n '/1\./,/2\./p' saforumscrapetidy)
echo $QUOTE1
('saforumscrapetidy' is the name of a file storing a neater version of the response to the POST request)

xtal
Jan 9, 2011

by Fluffdaddy

hbag posted:

alright, i got another issue
im using sed to isolate the quotes for my script, but when i do it's putting the whole quote on one line, instead of separate lines like it is before i isolate them with sed
for example:

Before isolating the quotes:


After isolating the quotes:


Here's the code I'm using to isolate the quotes, I'd appreciate any advice with how I can get it to include the newline characters.

code:
QUOTE1=$(sed -n '/1\./,/2\./p' saforumscrapetidy)
echo $QUOTE1
('saforumscrapetidy' is the name of a file storing a neater version of the response to the POST request)

Is that the same as the result of sed without the QUOTE= and echo?

hbag
Feb 13, 2021

xtal posted:

Is that the same as the result of sed without the QUOTE= and echo?

Nope. Doing it without putting it in a variable seems to work. I guess I can work with this.

xtal
Jan 9, 2011

by Fluffdaddy

hbag posted:

Nope. Doing it without putting it in a variable seems to work. I guess I can work with this.

This probably would work then:

code:

QUOTE1="$(sed -n '/1\./,/2\./p' saforumscrapetidy)"
echo "$QUOTE1" 

Don't try and figure out how quotes work in Bash, they make no sense and you should just use a real scripting language at this point. But for more reading, the reason for what you're seeing is the IFS variable.
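
For anyone wondering why the newlines vanished: a rough model of the two echo calls, written in Python only so the behaviour is easy to poke at. This is a sketch, not how Bash is implemented. It assumes the default IFS (space, tab, newline) and ignores glob expansion, and the function names are made up for illustration.

```python
def unquoted_echo(value: str) -> str:
    """Approximate `echo $VAR`: the value is split into words on whitespace,
    and echo prints those words joined by single spaces."""
    return " ".join(value.split())


def quoted_echo(value: str) -> str:
    """Approximate `echo "$VAR"`: the value reaches echo as one argument,
    so embedded newlines survive."""
    return value


# Hypothetical sample text standing in for the sed output.
sample = "1. first line\n   second line\n2. next quote"
flattened = unquoted_echo(sample)
preserved = quoted_echo(sample)
```

The quotes that matter are the ones in `echo "$QUOTE1"`; the variable itself held the newlines all along.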

hbag
Feb 13, 2021

xtal posted:

This probably would work then:

code:
QUOTE1="$(sed -n '/1\./,/2\./p' saforumscrapetidy)"
echo "$QUOTE1" 
Don't try and figure out how quotes work in Bash, they make no sense and you should just use a real scripting language at this point. But for more reading, the reason for what you're seeing is the IFS variable.

Yep, that worked great, thanks.

hbag
Feb 13, 2021

code:
line 17: wc -l < /tmp/tmp.DUggRX0SNv5Ya9: syntax error: invalid arithmetic operator (error token is ".DUggRX0SNv5Ya9")
agony

xtal
Jan 9, 2011

by Fluffdaddy

hbag posted:

code:
line 17: wc -l < /tmp/tmp.DUggRX0SNv5Ya9: syntax error: invalid arithmetic operator (error token is ".DUggRX0SNv5Ya9")
agony

What's the whole code?

KillHour
Oct 28, 2007


CopperHound posted:

I'm trying to get familiar with SQL and django and ran into one problem with this schema that is driving me nuts:


I want to be able to Select subscriptions by their next bill date which is calculated by:
last billed + interval + pause duration for all pauses that begin before the next bill

The first wrong solution that comes to mind is to make a view like this:
code:
SELECT
	s.*,
	ADDTIME(s.last_billed + INTERVAL interval_months MONTH + INTERVAL interval_weeks WEEK,SUM(TIMEDIFF(p.end,p.begin))) AS next_bill
FROM subscription AS s
LEFT JOIN pause p
	ON s.id = p.subscription_id
WHERE p.begin >= s.last_billed AND p.begin < (s.last_billed + INTERVAL interval_months MONTH + INTERVAL interval_weeks WEEK)
GROUP BY s.id
But this leaves out pauses that should be included because of the new extended bill date.

In my django app I COULD iterate through the pauses for each subscription, but that would mean making a separate remote database call for EVERY row in the subscription table.

Is there something obvious I'm missing here, or should I have a denormalized next_bill column that gets updated by something like a trigger?

Are there going to be a lot of future pauses? Just join the two tables where pauses begin after the last bill date and figure out the math on your app.

How often is this data going to be updated vs. viewed in the app? Deciding whether to denormalize depends heavily on that because you want to do the math as infrequently as possible.

Comedy answer: Use a document NoSQL database and store the pauses as an array in the same document.

KillHour fucked around with this message at 15:47 on Feb 18, 2021

CopperHound
Feb 14, 2012

KillHour posted:

Are there going to be a lot of future pauses? Just join the two tables where pauses begin after the last bill date and figure out the math on your app.
Not a lot. Usually there isn't more than one future pause scheduled at a time, but I want to be prepared for the edge cases.

KillHour posted:

How often is this data going to be updated vs. viewed in the app? Deciding whether to denormalize depends heavily on that because you want to do the math as infrequently as possible.
This is an update-infrequently, read-often situation.
I stripped the schema down to the minimum for my question, but there are also fields for a requested subscription end date and a soft-end boolean that makes the subscription end at the end of the bill cycle.
Most of the selects will filter by active subscriptions and will need to read the next bill date for every subscription that has the soft-end flag set and is past the requested end date.

KillHour
Oct 28, 2007


CopperHound posted:

This is an update-infrequently, read-often situation.

Denormalize the next bill date.
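
A minimal sketch of the app-side math that would fill such a column, assuming pauses never overlap and that the interval is passed in as a plain timedelta (month arithmetic needs a calendar-aware helper and is left out here). The function name and argument shapes are made up for illustration and are not taken from the schema.

```python
from datetime import datetime, timedelta


def next_bill_date(last_billed, interval, pauses):
    """Return last_billed + interval, pushed back by every pause that
    begins before the (running) next bill date.

    pauses: iterable of (begin, end) datetime pairs, assumed non-overlapping.
    """
    next_bill = last_billed + interval
    for begin, end in sorted(pauses):
        if begin < last_billed:
            continue  # pause started before this billing cycle
        if begin >= next_bill:
            break  # sorted by begin, so no later pause can qualify either
        # This pause starts before the current bill date, so it extends it,
        # which may in turn pull the next pause into range.
        next_bill += end - begin
    return next_bill


# Hypothetical data: 4-week cycle, one pause inside it, one pause that only
# counts because the first pause pushed the bill date out, one far-future pause.
last = datetime(2021, 1, 1)
example_pauses = [
    (datetime(2021, 2, 1), datetime(2021, 2, 3)),
    (datetime(2021, 1, 10), datetime(2021, 1, 17)),
    (datetime(2021, 3, 1), datetime(2021, 3, 5)),
]
result = next_bill_date(last, timedelta(weeks=4), example_pauses)
```

The loop is the part a single WHERE clause can't express: each pause that counts moves the cutoff for the next one. Recomputing this whenever a subscription or pause row changes keeps the stored column in sync.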

SkyeAuroline
Nov 12, 2020

Because of a series of unfortunate events, I've been compelled to write macros in (modified) Visual Basic at work. Not VBA, the "dead since '08" original; it's the only thing our software will interface with. I have almost 0 programming experience but I'm attempting to stitch together functional code, and actually kind of succeeding, with only one running into issues so far. I have a script that's intended to send the contents of one field over to the clipboard. This is replacing the only part of the workflow that requires a mouse, to hopefully speed things up. Unfortunately, it's not speeding things up, because it keeps throwing a "can't access clipboard" error every 2 or 3 times I run it, sometimes several in a row. I'm aware that this is a limitation of accessing the Windows clipboard due to other programs accessing it, though I have no idea what else would be. Normally this would be fine and I would just run it again, but a) it causes the program to halt and open extra windows every time, b) this is integrated into the save function as part of spaghetti code I have no control over & repeatedly overwriting the same data just to get the clipboard to work is pointless.

I know "don't use VB" is the best advice here, but barring that, what do I need to do to minimize these collisions with other programs for the clipboard? Plain language is fine if that's needed, I'll interpret from there. No direct access from SA to the computer I'm on but can also copy out code if needed.

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

SkyeAuroline posted:

Because of a series of unfortunate events, I've been compelled to write macros in (modified) Visual Basic at work. Not VBA, the "dead since '08" original; it's the only thing our software will interface with. I have almost 0 programming experience but I'm attempting to stitch together functional code, and actually kind of succeeding, with only one running into issues so far. I have a script that's intended to send the contents of one field over to the clipboard. This is replacing the only part of the workflow that requires a mouse, to hopefully speed things up. Unfortunately, it's not speeding things up, because it keeps throwing a "can't access clipboard" error every 2 or 3 times I run it, sometimes several in a row. I'm aware that this is a limitation of accessing the Windows clipboard due to other programs accessing it, though I have no idea what else would be. Normally this would be fine and I would just run it again, but a) it causes the program to halt and open extra windows every time, b) this is integrated into the save function as part of spaghetti code I have no control over & repeatedly overwriting the same data just to get the clipboard to work is pointless.

I know "don't use VB" is the best advice here, but barring that, what do I need to do to minimize these collisions with other programs for the clipboard? Plain language is fine if that's needed, I'll interpret from there. No direct access from SA to the computer I'm on but can also copy out code if needed.

tbh without the section of the code that gets poo poo from the clipboard, hard to say. interfacing with the windows clipboard is unironically rocket science and there's lots of things that can go wrong, no matter what language you do it from.

SkyeAuroline
Nov 12, 2020

Amending the statement: it independently claims to be VB-derived and not VBA, and VBA-derived. Whatever. It's SmarTerm Macro. Very short currently so from my phone:
pre:
Sub ExportSKU
  dim ScnText as string

  Clipboard.clear
  ScnText = Session.ScreenText(3,17,1,9)
  Clipboard$ ScnText
  Session.Send "^[[26~~"
End Sub
Clears existing clipboard content (I'm not sure that this step is necessary, but it's in the manual examples so I kept it here), sets the export string to the 9-digit string I'm copying, sends to clipboard, sends the string that triggers the program's save function.

Since I have very little idea what I'm doing I kept it as simple as possible, which means 0 redundancy or anything whatsoever. Like I said, not a programmer myself, just picking up slack where our IT team hasn't been able to advise.

hbag
Feb 13, 2021

xtal posted:

What's the whole code?

sorry i didnt respond last night, was probed
here's the relevant code

code:
TRIMSCRAPE=$(mktemp /tmp/tmp.XXXXXXXXXX)
QUOTE1=$(mktemp /tmp/tmp.XXXXXXXXXXXXXX)
...
sed -n '/1\./,/2\./p' "$TRIMSCRAPE" > "$QUOTE1"
sed -n -e '2,$p' "$QUOTE1" | head -n $((wc -l < "$QUOTE1")) "$QUOTE1"

xtal
Jan 9, 2011

by Fluffdaddy
Change from $(()) to $() around wc. $( ) is command substitution (it runs the command in a subshell and captures its output); $(( )) is arithmetic evaluation. You might want to try out https://www.shellcheck.net/ on the whole script too.

hbag
Feb 13, 2021

xtal posted:

Change from $(()) to $() around wc. $( ) is command substitution (it runs the command in a subshell and captures its output); $(( )) is arithmetic evaluation. You might want to try out https://www.shellcheck.net/ on the whole script too.

Yep, that worked, thanks. I should stop trying to program when I'm exhausted.
also, go back to C-SPAM

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

SkyeAuroline posted:

Amending the statement: it independently claims to be VB-derived and not VBA, and VBA-derived. Whatever. It's SmarTerm Macro. Very short currently so from my phone:
pre:
Sub ExportSKU
  dim ScnText as string

  Clipboard.clear
  ScnText = Session.ScreenText(3,17,1,9)
  Clipboard$ ScnText
  Session.Send "^[[26~~"
End Sub
Clears existing clipboard content (I'm not sure that this step is necessary, but it's in the manual examples so I kept it here), sets the export string to the 9-digit string I'm copying, sends to clipboard, sends the string that triggers the program's save function.

Since I have very little idea what I'm doing I kept it as simple as possible, which means 0 redundancy or anything whatsoever. Like I said, not a programmer myself, just picking up slack where our IT team hasn't been able to advise.

So let me give you an example of what interfacing with the clipboard might look like in vb (hastily googled)
code:
Option Explicit
 
        Declare Function GlobalUnlock Lib "kernel32" (ByVal hMem As Long) As Long
        Declare Function GlobalLock Lib "kernel32" (ByVal hMem As Long) As Long
        Declare Function GlobalAlloc Lib "kernel32" (ByVal wFlags As Long, _
                                                     ByVal dwBytes As Long) As Long

        Declare Function CloseClipboard Lib "User32" () As Long
        Declare Function OpenClipboard Lib "User32" (ByVal hwnd As Long) As Long
        Declare Function EmptyClipboard Lib "User32" () As Long

        Declare Function lstrcpy Lib "kernel32" (ByVal lpString1 As Any, _
                                                 ByVal lpString2 As Any) As Long

        Declare Function SetClipboardData Lib "User32" (ByVal wFormat _
                                                        As Long, ByVal hMem As Long) As Long


Public Const GHND = &H42
Public Const CF_TEXT = 1
Public Const MAXSIZE = 4096

Sub ClipBoard_SetData(MyString As String)

         Dim hGlobalMemory As Long
         Dim hClipMemory   As Long
        Dim lpGlobalMemory    As Long

        Dim x                 As Long

        ' Allocate moveable global memory.
        '-------------------------------------------
        hGlobalMemory = GlobalAlloc(GHND, Len(MyString) + 1)

        ' Lock the block to get a far pointer
        ' to this memory.
        lpGlobalMemory = GlobalLock(hGlobalMemory)

        ' Copy the string to this global memory.
        lpGlobalMemory = lstrcpy(lpGlobalMemory, MyString)

        ' Unlock the memory.
        If GlobalUnlock(hGlobalMemory) <> 0 Then
            MsgBox "Could not unlock memory location. Copy aborted."
            GoTo OutOfHere2
        End If

        ' Open the Clipboard to copy data to.
        If OpenClipboard(0&) = 0 Then
            MsgBox "Could not open the Clipboard. Copy aborted."
            Exit Sub
        End If

        ' Clear the Clipboard.
        x = EmptyClipboard()

        ' Copy the data to the Clipboard.
        hClipMemory = SetClipboardData(CF_TEXT, hGlobalMemory)

OutOfHere2:

        If CloseClipboard() = 0 Then
            MsgBox "Could not close Clipboard."
        End If

End Sub
There's probably an implementation of clipboard in your scripting language, and it's probably not doing something right, as you need to copy stuff into global memory etc. to use the clipboard... There's also the possibility that something else is trying to change the clipboard at the same time you're using it (e.g. a clipboard reader program like clipspy).

Bruegels Fuckbooks fucked around with this message at 21:58 on Feb 18, 2021

SkyeAuroline
Nov 12, 2020

Bruegels Fuckbooks posted:

So let me give you an example of what interfacing with the clipboard might look like in vb (hastily googled)
code:
Option Explicit

Declare Function GlobalUnlock Lib "kernel32" (ByVal hMem As Long) As Long
Declare Function GlobalLock Lib "kernel32" (ByVal hMem As Long) As Long
Declare Function GlobalAlloc Lib "kernel32" (ByVal wFlags As Long, _
ByVal dwBytes As Long) As Long

Declare Function CloseClipboard Lib "User32" () As Long
Declare Function OpenClipboard Lib "User32" (ByVal hwnd As Long) As Long
Declare Function EmptyClipboard Lib "User32" () As Long

Declare Function lstrcpy Lib "kernel32" (ByVal lpString1 As Any, _
ByVal lpString2 As Any) As Long

Declare Function SetClipboardData Lib "User32" (ByVal wFormat _
As Long, ByVal hMem As Long) As Long


Public Const GHND = &H42
Public Const CF_TEXT = 1
Public Const MAXSIZE = 4096

Sub ClipBoard_SetData(MyString As String)

Dim hGlobalMemory As Long
Dim hClipMemory As Long
Dim lpGlobalMemory As Long

Dim x As Long

' Allocate moveable global memory.
'-------------------------------------------
hGlobalMemory = GlobalAlloc(GHND, Len(MyString) + 1)

' Lock the block to get a far pointer
' to this memory.
lpGlobalMemory = GlobalLock(hGlobalMemory)

' Copy the string to this global memory.
lpGlobalMemory = lstrcpy(lpGlobalMemory, MyString)

' Unlock the memory.
If GlobalUnlock(hGlobalMemory) <> 0 Then
MsgBox "Could not unlock memory location. Copy aborted."
GoTo OutOfHere2
End If

' Open the Clipboard to copy data to.
If OpenClipboard(0&) = 0 Then
MsgBox "Could not open the Clipboard. Copy aborted."
Exit Sub
End If

' Clear the Clipboard.
x = EmptyClipboard()

' Copy the data to the Clipboard.
hClipMemory = SetClipboardData(CF_TEXT, hGlobalMemory)

OutOfHere2:

If CloseClipboard() = 0 Then
MsgBox "Could not close Clipboard."
End If

End Sub

There's probably an implementation of clipboard in your scripting language, and it's probably not doing something right, as you need to copy stuff into global memory etc. to use the clipboard... There's also the possibility that something else is trying to change the clipboard at the same time you're using it (e.g. a clipboard reader program like clipspy).

Aight, so that's terrifying (and also way longer). The good news is that while talking through it out loud with our IT dude, I figured out error catching, and it seems like it's robust enough to stop any issues. Cycles through a few retries with delays between, throws an error message if it can't get through after that. For an amateur just putting something together for accessibility, this is probably good enough - through about 80 runs of it in the actual software with no hiccups yet.

Thanks for the advice & example.

CarForumPoster
Jun 26, 2013

⚡POWER⚡
Fair warning, I've been drinking

melon cat posted:

Python beginner question. I've noticed that a lot of Python tutorials (like this one which explains how to make a blog with Python) use terminal to install stuff and run commands. This is easy on a Mac since you can run Terminal easily on that OS. But I have no idea how this is done in Windows 10. And every resource I've looked at always shits on Windows 10 for its broken cmd prompt.

Am I better off learning and using Python on Ubuntu? Because I really don't mind dual booting Windows + Ubuntu. I'm just getting kind of irritated that most tutorials assume you're on a Mac/Linux and I think I've spent way too much time trying to figure out how to get these Terminal commands working on Windows.

Hot take: I exclusively dev on Win 10. WSL is a fool's errand. You need Anaconda for a while and to use conda envs if you're developing for general purpose stuff. When you need to deploy poo poo to the web like flask or django you'll wanna run docker and maybe switch to virtualenv, though I highly recommend skipping figuring out docker and just adopt AWS SAM. gently caress servers and gently caress configuring them. If you wanna put django on the internet using Win10 use Zappa at first and when you need to config poo poo beyond that use AWS SAM to deploy to AWS Lambda. Here's how to get a django site on AWS Lambda in 15 minutes for free: https://romandc.com/zappa-django-guide/

hbag posted:

...now to figure out which cookie I need. none of these seem to really be standing out, but I might just have a smooth brain.

Here's a lazy rear end, slow but faster than figuring this bullshit out way: Get whatever site you want with selenium and log in the old-fashioned way. When you wanna pass logins between sessions just use selenium's get_cookies and add_cookie methods. Couldn't be easier. I use this when I want to scrape a search result in parallel to do something like: 1) Login to website 2) Search for thing, getting the cookie from the logged in session. 3) Pass the cookie from 2) to multiple parallel lambda functions

CarForumPoster fucked around with this message at 06:07 on Feb 20, 2021

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

CarForumPoster posted:

WSL is a fool's errand.

How so?

(I have no experience with it whatsoever and haven't done Windows dev for years, so I'm inclined to believe you.)

CarForumPoster
Jun 26, 2013

⚡POWER⚡

pokeyman posted:

How so?

(I have no experience with it whatsoever and haven't done Windows dev for years, so I'm inclined to believe you.)

If you're using Windows I'd wager everything you need is available on Windows. I say this having deployed several python ML models, web scrapers and apps (all Django, Flask, or some code running on AWS Lambda or EBS). All dev'd on Win10. You might not be able to run the midnight build of pyTorch/FastAI, but usually last stable of any package will do if you're on Windows.

If you're developing a project that needs to be deployed and you want to test in a prod-like environment, using WSL is dumb because it doesn't emulate your prod environment. It's just regular Linux. That's the entire purpose of docker, to exactly-as-possible replicate your prod environment. So just use Docker Desktop and its handy CLI if you need to interact with it rather than trying to figure out how to SSH in.

So I arrive at the conclusion that the probability of anyone actually benefitting from WSL for the purposes of python development specifically is almost nothing and I say this as someone who has WSL2 Ubuntu installed. WSL has its place, but prob not the right choice for python development if your preferred OS is Windows.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
Seems logical! Thanks for explaining :)

nielsm
Jun 1, 2009



Bruegels Fuckbooks posted:

There's probably an implementation of clipboard in your scripting language, and it's probably not doing something right, as you need to copy stuff into global memory etc. to use the clipboard... There's also the possibility that something else is trying to change the clipboard at the same time you're using it (e.g. a clipboard reader program like clipspy).

I've definitely had copy-to-clipboard operations randomly failing in basic .NET Framework WPF applications, just placing a simple text string on the clipboard using the built-in classes. Random failures are just part of the game, if you really really need to place something on there you're better off retrying several times before you give up.

The Windows clipboard really is a complex beast yes, and it's possible to do lots of things with it.
- You can put data in multiple alternative formats on the clipboard at once. For example, plain text, RTF text, HTML, and a rendered bitmap of the text.
- You can put just the promise of being able to deliver data on demand on the clipboard. So when somebody initiates a paste operation, your program receives a message to actually deliver the data in whatever format you promised to be able to.

And yes the clipboard is a global resource in the login session (or maybe desktop object? not sure) with all the possible race conditions and locking issues that can cause, when disparate programs all want to play with it.
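
A minimal sketch of the retry-then-give-up pattern described above, in Python rather than the macro language. `ClipboardBusy` and the callable passed in are hypothetical stand-ins for whatever error and clipboard call your environment actually exposes.

```python
import time


class ClipboardBusy(Exception):
    """Hypothetical error standing in for a failed clipboard write."""


def retry(action, attempts=5, delay=0.1, sleep=time.sleep):
    """Call action() until it succeeds or the attempts run out.

    Waits `delay` seconds between tries and re-raises the last error
    if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except ClipboardBusy:
            if attempt == attempts:
                raise  # out of retries: surface the error to the caller
            sleep(delay)


# Demo with a fake clipboard call that fails twice and then succeeds.
calls = []


def flaky_copy():
    calls.append(1)
    if len(calls) < 3:
        raise ClipboardBusy("clipboard held by another program")
    return "copied"


outcome = retry(flaky_copy, sleep=lambda seconds: None)  # skip real waiting in the demo
```

Keeping the attempt count small and the delay short means a genuinely stuck clipboard still produces an error quickly instead of hanging the save.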

hbag
Feb 13, 2021

how could i delete everything in a string EXCEPT what my regex matches?
the pattern im using right now excludes all the other LINES, sure, but i only want the string itself that matches, not the entire line

CarForumPoster
Jun 26, 2013

⚡POWER⚡

hbag posted:

how could i delete everything in a string EXCEPT what my regex matches?
the pattern im using right now excludes all the other LINES, sure, but i only want the string itself that matches, not the entire line

In Python I believe you just use re.search or re.findall which are part of the std lib. I’d imagine most other languages have analogous things.

This wouldn’t delete everything else of course, it’d simply extract whatever the pattern matches.
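
Roughly what that looks like, as a sketch. The sample text and patterns are made up for illustration and are not the actual search-page output.

```python
import re

# Hypothetical sample text.
text = "posted by hbag in thread A; posted by xtal in thread B"

# re.search returns the first match (or None); group(0) is only the matched
# text, not the whole string it was found in.
first = re.search(r"by \w+", text)
first_match = first.group(0) if first else None

# With one capturing group, re.findall returns just that group for every match.
usernames = re.findall(r"by (\w+) in", text)
```

Here `first_match` holds only the matched fragment and `usernames` holds one entry per match.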

hbag
Feb 13, 2021

CarForumPoster posted:

In Python I believe you just use re.search or re.findall which are part of the std lib. I’d imagine most other languages have analogous things.

This wouldn’t delete everything else of course, it’d simply extract whatever the pattern matches.

I'd prefer to do it with either bash or perl, but it's good to know I've got that as a backup.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

hbag posted:

how could i delete everything in a string EXCEPT what my regex matches?
the pattern im using right now excludes all the other LINES, sure, but i only want the string itself that matches, not the entire line

Can you replace the line with the match? Maybe capture group 0 is the match. Otherwise, add parens around your regex and grab group 1.

hbag
Feb 13, 2021

pokeyman posted:

Can you replace the line with the match? Maybe capture group 0 is the match. Otherwise, add parens around your regex and grab group 1.

I have no idea how I'd even do that. Everything I've found on Stack Overflow CLAIMS to do what I want, but it doesn't. It grabs the entire line. Every. Single. One.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
What tool are you using that is asking you to use a regex?

hbag
Feb 13, 2021

Jabor posted:

What tool are you using that is asking you to use a regex?

I've tried sed, grep, awk, etc. My script sends a POST request to the search page and then trims down the response. So far, it works pretty well, but I'm just having trouble with trimming down the username of the goon who made a post so that I can actually put it in the script.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
You're trying to do this as part of a shell script? You might find it helpful to switch to something like Python, which has first-class support for manipulating text instead of having to run other programs to tweak stuff into the desired form.

If you want to stick with a shell script, you can do this with awk:
code:
awk 'match($0, "(.*) posted:", m) {print m[1]}'
The trick is to use a capturing group on the part of the regex that contains the thing you're actually interested in, and only printing out the contents of that group.

hbag
Feb 13, 2021

Jabor posted:

You're trying to do this as part of a shell script? You might find it helpful to switch to something like Python, which has first-class support for manipulating text instead of having to run other programs to tweak stuff into the desired form.

If you want to stick with a shell script, you can do this with awk:
code:
awk 'match($0, "(.*) posted:", m) {print m[1]}'
The trick is to use a capturing group on the part of the regex that contains the thing you're actually interested in, and only printing out the contents of that group.

I can't switch to Python, I've already written the entire rest of the script in bash. I have to do this.
And while your script looks useful, I'm having a hard time reading it due to... not knowing how awk works, really. manpages are often too verbose for my tiny, smooth brain

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


It might be faster to rewrite everything in Python than to figure out how to do what you want in bash.

hbag
Feb 13, 2021

ultrafilter posted:

It might be faster to rewrite everything in Python than to figure out how to do what you want in bash.

oh, absolutely, but this has turned into a principle thing now

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Awk is a complicated and fiddly scripting language that shouldn't be used for actually writing programs, just tiny one-liners that do a single text manipulation.

For this case specifically, look for the match function in the manual:
https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html

The inputs are the line of text ($0), the regular expression (which I assume you've already written, maybe just need to add parentheses for a capturing group to), and an output array (which we've just called m).
If a match is found on a given line, we print out m[1], which is the contents of the first capturing group. (m[0] would be the entire match, and other indices would have the contents of other capturing groups, if any).
If there's no match, we don't print anything, effectively skipping that line.
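
For comparison, the same match-then-print-group-1 logic as a Python sketch. The function name and sample lines are hypothetical; the pattern is the one from the awk one-liner.

```python
import re


def extract_group(lines, pattern):
    """Yield capture group 1 for each line the pattern matches; skip the rest."""
    regex = re.compile(pattern)
    for line in lines:
        m = regex.search(line)
        if m:
            # m.group(0) would be the entire match, like m[0] in gawk.
            yield m.group(1)


# Hypothetical input lines.
sample_lines = ["leper khan posted:", "some unrelated text", "hbag posted:"]
names = list(extract_group(sample_lines, r"(.*) posted:"))
```

Lines without a match produce nothing, which is the same "skip that line" behaviour as the awk version.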

hbag
Feb 13, 2021

Jabor posted:

Awk is a complicated and fiddly scripting language that shouldn't be used for actually writing programs, just tiny one-liners that do a single text manipulation.

For this case specifically, look for the match function in the manual:
https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html

The inputs are the line of text ($0), the regular expression (which I assume you've already written, maybe just need to add parentheses for a capturing group to), and an output array (which we've just called m).
If a match is found on a given line, we print out m[1], which is the contents of the first capturing group. (m[0] would be the entire match, and other indices would have the contents of other capturing groups, if any).
If there's no match, we don't print anything, effectively skipping that line.

...Alright, I tried that, and I got an error. I might just be doing something wrong, but the system I'm writing this on might also just be acting like a prick... again.
Here's the command I used:
code:
awk 'match($0, "by (.*) in", m) {print m[1]}'
I'm trying to get any text that's between the words 'by' and 'in'. Here's the error I got:
code:
awk: syntax error at source line 1
 context is
        match($0, "by (.*) >>>  in", <<<
awk: bailing out at source line 1

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Sounds like your version of awk doesn't support extracting capturing groups from the match.

Some options:
- If you have gawk on your system, try that instead
- Strongly consider rewriting what you already have using Python or some other programming language. You'll likely find that writing it the second time, with your older code as a reference, is far far easier than writing it the first time was.

hbag
Feb 13, 2021

gawk worked great, thanks
