ToxicFrog
Apr 26, 2008


Jose Cuervo posted:

When I have a separate *.bib file for each paper, the *.bib file is contained in the same folder as the *.tex files for that paper. When I send my advisors the latest version of the paper, all the necessary files are zipped in one folder, so all they need to do is extract the files to a single folder and go from there. With the master bib file idea, they would then need to put the master bib file in the appropriate (relative) folder, which is one step too many for my advisors.

In that case the symlink idea will work perfectly. Merge all of the files, create a symlink in each paper's folder to the master file. When you zip the folder, the zip file will contain the master file the symlink pointed to when you zipped it, so anyone unzipping it will have the right file in the right place, no fuss.

As an added bonus, this way you don't need to re-run the script every time you modify the master file!

quote:

I know what I want is probably overkill, but I would like to make certain that each time I make a change to a citation, that exact change occurs everywhere. If you wouldn't mind putting together the example implementation I would appreciate it.

Sure thing.

E: here it is. I offer this mostly as a programming exercise; I still think the symlink approach is better.

Note that as an example, error checking is minimal; if it can't open a file it'll just abort, and if a file already exists that it wants to write it'll be overwritten.

ToxicFrog fucked around with this message at 00:12 on Jul 4, 2012


Jose Cuervo
Aug 25, 2004

ToxicFrog posted:

In that case the symlink idea will work perfectly. Merge all of the files, create a symlink in each paper's folder to the master file. When you zip the folder, the zip file will contain the master file the symlink pointed to when you zipped it, so anyone unzipping it will have the right file in the right place, no fuss.

As an added bonus, this way you don't need to re-run the script every time you modify the master file!


Sure thing.

E: here it is. I offer this mostly as a programming exercise; I still think the symlink approach is better.

Note that as an example, error checking is minimal; if it can't open a file it'll just abort, and if a file already exists that it wants to write it'll be overwritten.

Thanks for this. I am going to look at the example and read up on the symlink stuff.

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

Jose Cuervo posted:

Thanks for this. I am going to look at the example and read up on the symlink stuff.

What operating system are you using? On Unix variants it's a simple ln -s master-bibtex-file.bib paper1/bibtex-file.bib on the command line. For Windows, the Sysinternals Junction tool does the same thing. I've heard that this tool (or something equivalent) is included with newer versions of Windows, but you'll need to download it if you're using XP. Note that the order of the arguments is reversed: ln follows cp as 'source, destination', whereas junction.exe is 'destination, source'.
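If you'd rather script this than type the commands by hand, Python's os.symlink takes its arguments in the same order as ln: target first, link name second. A minimal sketch, with all paths made up for illustration:

```python
import os
import tempfile

# Hypothetical layout: a master .bib file, plus a per-paper folder
# that should contain a symlink to it.
root = tempfile.mkdtemp()
master = os.path.join(root, "master.bib")
paper_dir = os.path.join(root, "paper1")
os.mkdir(paper_dir)
with open(master, "w") as f:
    f.write("@article{example, title={An Example}}\n")

link = os.path.join(paper_dir, "refs.bib")
# Same argument order as `ln -s`: target first, link name second.
os.symlink(master, link)

# Reading through the link sees the master file's contents.
with open(link) as f:
    assert f.read().startswith("@article")
```

(On Windows, creating symlinks may require elevated privileges or developer mode.)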

Lysidas fucked around with this message at 05:09 on Jul 4, 2012

Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

Lysidas posted:

Note that the order of the arguments is backwards; ln follows cp as 'source, destination', whereas junction.exe is 'destination, source'.

I tend to get confused if I think of ln's order as source, dest instead of target, link_name

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Lysidas posted:

I've heard that this tool (or something equivalent) is included with newer versions of Windows

mklink

Jose Cuervo
Aug 25, 2004
Although it is overkill, I managed to code up a script(?) in Python based on the help provided here. Thanks for the guidance.

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Otto Skorzeny posted:

I tend to get confused if I think of ln's order as source, dest instead of target, link_name

The use of source and destination in the man pages used to bother the poo poo out of me before I memorized the order. Source of what, now? Destination? Where the link points to? Terrible word choice.

ToxicFrog
Apr 26, 2008


Munkeymon posted:

The use of source and destination in the man pages used to bother the poo poo out of me before I memorized the order. Source of what, now? Destination? Where the link points to? Terrible word choice.

But the man page doesn't actually use "source" and "destination", it uses "target" and "link_name" :confused:

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



ToxicFrog posted:

But the man page doesn't actually use "source" and "destination", it uses "target" and "link_name" :confused:

Huh. I did learn on Solaris 8, so maybe the Linux pages are better or maybe I'm just remembering wrong - wouldn't be the first time :v:

hooligan
Jun 15, 2001

Arrogant Dutch Fuckwit.
Edit: Sorry, wrong topic

hooligan fucked around with this message at 17:22 on Jul 5, 2012

het
Nov 14, 2002

A dark black past
is my most valued
possession

Munkeymon posted:

Huh. I did learn on Solaris 8, so maybe the Linux pages are better or maybe I'm just remembering wrong - wouldn't be the first time :v:
No, I definitely remember this too.

ToxicFrog posted:

But the man page doesn't actually use "source" and "destination", it uses "target" and "link_name" :confused:
Hilariously, I went to look up old man pages and Solaris 9 uses "target" too, except it uses it in place of "link_name", not "source".

Mustach
Mar 2, 2003

In this long line, there's been some real strange genes. You've got 'em all, with some extras thrown in.
code:
% man ln
LN(1)                     BSD General Commands Manual                    LN(1)

NAME
     link, ln -- make links

SYNOPSIS
     ln [-Ffhinsv] source_file [target_file]
This always confuses me, because I think of target as the target of the link, so I just forget about it and remember that it's the same order as for cp.

HFX
Nov 29, 2004

PDP-1 posted:

I believe you can have up to two parallel ports, LPT1 and LPT2. The control/data registers were memory mapped in older Windows systems (I haven't used this stuff past the Win2000 era) so it wasn't possible to have more than that.

I wouldn't be thrilled about building something new on a deprecated IO system like parallel ports, but if you absolutely have to go that way look into getting a couple of USB-to-Parallel converter cables so that at least your control computer doesn't need to have any special hardware. Just make sure that the drivers that come with the cables support your OS (some USB-to-RS232 drivers have problems with Win7, I imagine the same caveat applies here), and buy both cables from one manufacturer since the drivers run at a kernel level and having multiple port emulators from different companies running at the same time will bluescreen your machine on a regular basis.

I actually do a fair bit of work in embedded development, where serial and parallel ports are still common. I have had some USB-to-serial adapters cause blue screens of death. I've also had trouble finding a USB-to-parallel cable that functions as a real parallel port (important for a JTAG debugger). Do you have any recommendations on either for Windows 7?

Elston Gunn
Apr 15, 2005

I'm working with hourly data collected by weather stations that comes in a fixed width column format using R. The problem is that after 34 columns of data that is common to every station there is additional data that is not necessarily collected by every station. This additional data is a certain number of characters following a three character ID string. Ultimately I want to search through each file and pull out a few columns from each line that contains the ID code and output to a new file. Is there a way in R that I can do this?

Cup Runneth Over
Aug 8, 2009

She said life's
Too short to worry
Life's too long to wait
It's too short
Not to love everybody
Life's too long to hate


Dear Ruby goons,

No, this is not a RoR question. Yes, I'm a bad person. I am having some serious trouble with the One-Click Ruby Application (OCRA) gem that I wasn't having before. It throws this error:
code:
=== Compressing 5914523 bytes
C:/Ruby193/lib/ruby/gems/1.9.1/gems/ocra-1.3.0/bin/ocra:1005:in `write': Bad file descriptor - RubyRPG1-6.exe (Errno::EBADF)
        from ocra:1005:in `block in initialize'
        from ocra:983:in `open'
        from ocra:983:in `initialize'
        from ocra:822:in `new'
        from ocra:822:in `build_exe'
        from ocra:1138:in `block in (top (required))'

Then it creates a 34KB .exe with a bad file signature. This happens no matter where I compile, in Downloads or on the Desktop, even when I run it as administrator, and even when I try to compile .rb files I have previously compiled into .exes successfully. Normally I recall this error message popping up if there was already a .exe with that file name in the directory, but I make sure to delete or rename any duplicates before I run the compiler.

OCRA is the only thing I've found that's actually completely worked for me, so this is distressing. Does anyone know what might be the problem?

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
I have a question about pattern matching. Say I have a record of integers ranging -1 to 1, so I have something like:

[-1, -1, 0, 0, 1, 0, 1, 1, 1, 0, 1, -1, 0, 0, -1...]

So then I am curious where something like [0, 1, 0, 1, 1, 1] might have shown up in my record. In this case, it matches exactly a sequence in the middle of what I showed. There might be others. I was wondering about techniques for scoring matches based on similarity instead. Say that I would want to match the same spot generally, but I have [0, 1, 0, 1, 1, 0] -- the last number is a little off. What are some techniques for doing stuff like this?

Zhentar
Sep 28, 2003

Brilliant Master Genius
That would be a Longest Common Subsequence problem. The classic approach would be a dynamic programming algorithm.

Xerophyte
Mar 17, 2008

This space intentionally left blank

Zhentar posted:

That would be a Longest Common Subsequence problem. The classic approach would be a dynamic programming algorithm.

Not necessarily; an error in the middle of the sequence would yield a very close match that is not necessarily the longest common subsequence. I'd look into Levenshtein distance and the algorithms for computing it, since what you're basically looking to do is calculate the Levenshtein distance between your pattern and every length-n substring of the record, then pick the closest match.

Edit: I should add that if you intend to only include substitutions then you're calculating the Hamming distance.
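Concretely, the substitutions-only (Hamming) version is just a sliding window over the record. A sketch in Python, using the record and patterns from the question:

```python
def hamming(a, b):
    """Number of positions where two equal-length sequences differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def best_matches(record, pattern):
    """Score every length-n window of record against pattern.

    Returns (start_index, distance) pairs; 0 means an exact match.
    """
    n = len(pattern)
    return [(i, hamming(record[i:i + n], pattern))
            for i in range(len(record) - n + 1)]

record = [-1, -1, 0, 0, 1, 0, 1, 1, 1, 0, 1, -1, 0, 0, -1]
exact = [0, 1, 0, 1, 1, 1]
fuzzy = [0, 1, 0, 1, 1, 0]   # last element is a little off

assert min(best_matches(record, exact), key=lambda t: t[1]) == (3, 0)
assert min(best_matches(record, fuzzy), key=lambda t: t[1]) == (3, 1)
```

To score insertions and deletions as well, the same sliding-window idea works with Levenshtein distance in place of hamming, at a higher cost per window.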

Xerophyte fucked around with this message at 18:13 on Jul 8, 2012

Safe and Secure!
Jun 14, 2008

OFFICIAL SA THREAD RUINER
SPRING 2013
The Wikipedia article for map says that it's an example of a homomorphism (link):

quote:

In many programming languages, map is the name of a higher-order function that applies a given function to each element of a list, returning a list of results. It is often called apply-to-all when considered in functional form. This is an example of homomorphism.

In what sense, though? Suppose we have a list of elements from some group, and we apply a function that is NOT a homomorphism to each element of the list to produce a second list. Or can the map function itself be considered a homomorphism under some operation like concatenating two lists before/after applying the map function with some arbitrary function f?

Is this just a case of Wikipedia being unreliable?

ToxicFrog
Apr 26, 2008


Safe and Secure! posted:

In what sense, though? Suppose we have a list of elements from some group, and we apply a function that is NOT a homomorphism to each element of the list to produce a second list. Or can the map function itself be considered a homomorphism under some operation like concatenating two lists before/after applying the map function with some arbitrary function f?

Disclaimer: I am not a mathematician.

My understanding is that a homomorphism is a function mapping an object to another object with the same structure - so map is homomorphic because the input and output are both lists, even if the function map applies to the individual list elements is not.

In contrast, something like reduce wouldn't be homomorphic because the output is not necessarily a list.

Someone with a better understanding of abstract algebra will probably come along and tell me how wrong I am now, but that's my best understanding of it. :)

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
You're basically right. Using SML notation, the type 't list is isomorphic to the free monoid on the set underlying 't; for a given function f : A -> B, then, map f is a homomorphism of monoids taking A list -> B list because it preserves the monoid structure. The structure of 't is irrelevant because the free monoid construction operates on an arbitrary set, and every well-typed function is a "set homomorphism".

VVV Ah yes, there's that too.

rjmccall fucked around with this message at 22:08 on Jul 8, 2012

shrughes
Oct 11, 2008

(call/cc call/cc)
map is a homomorphism for arguments of type (S -> S).

For any type S, functions of type (S -> S) form a monoid under composition. That is, (f o g) o h = f o (g o h) and there's an identity function.

map : (S -> S) -> (List<S> -> List<S>), and (List<S> -> List<S>) is likewise a monoid under composition.

map f o map g = map (f o g), and map id = id. Which means that map is a homomorphism.
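Both laws are easy to check on concrete data. A quick sketch in Python, with a curried map so the compositions line up:

```python
def compose(f, g):
    """Function composition: (f o g)(x) = f(g(x))."""
    return lambda x: f(g(x))

def lmap(f):
    """Curried map: lmap(f) has type List<S> -> List<S> when f : S -> S."""
    return lambda xs: [f(x) for x in xs]

f = lambda x: x + 1
g = lambda x: x * 2
identity = lambda x: x
xs = [1, 2, 3]

# map f o map g = map (f o g)
assert compose(lmap(f), lmap(g))(xs) == lmap(compose(f, g))(xs) == [3, 5, 7]
# map id = id
assert lmap(identity)(xs) == xs

# And the free-monoid reading: map f also preserves list concatenation.
ys = [4, 5]
assert lmap(f)(xs + ys) == lmap(f)(xs) + lmap(f)(ys)
```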

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


Edit: Never mind.

Grundulum
Feb 28, 2006

Grundulum posted:

Is there a way to change a variable's 'parameter' status from within the code? I'd like to read in a number that I will then use to allocate array sizes (easy if it doesn't have the parameter keyword), but since it's not my code I want to make sure that some line doesn't go and change that number after the input subroutine has run (hard if it isn't a parameter).

Reposting this in the hope that someone new will see it and have an answer.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
By design, PARAMETER introduces a symbolic, compile-time constant, so there's never going to be a way to set one dynamically. What you want is some sort of mutability control, forcing a particular global variable to be immutable in all contexts except one. To the limits of my knowledge, FORTRAN does not provide anything like that.

Curiously enough, C does: it is perfectly legal to declare a global variable "extern const" in all but one translation unit, and then that translation unit defines it without const. You just have to make sure that the defining translation unit doesn't see the const declaration.

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Rocko Bonaparte posted:

I have a question about pattern matching. Say I have a record of integers ranging -1 to 1, so I have something like:

[-1, -1, 0, 0, 1, 0, 1, 1, 1, 0, 1, -1, 0, 0, -1...]

So then I am curious where something like [0, 1, 0, 1, 1, 1] might have shown up in my record. In this case, it matches exactly a sequence in the middle of what I showed. There might be others. I was wondering about techniques for scoring matches based on similarity instead. Say that I would want to match the same spot generally, but I have [0, 1, 0, 1, 1, 0] -- the last number is a little off. What are some techniques for doing stuff like this?

There's also Smith-Waterman.

rolleyes
Nov 16, 2006

Sometimes you have to roll the hard... two?
Is there a practical way to go from a location's time zone name to its time zone abbreviation while maintaining DST awareness for locations which use it? E.g. from "Europe/London" to "BST" in the summer or from "Europe/London" to "GMT" in the winter.

I'm thinking this is a solved problem so really don't want to have to try to reimplement it myself, but I'm also severely limited in that the tools I have available to me for this can be summed up as follows: Oracle 10g. I know, this is entirely the wrong tool for the job but this is outside of my control.

I've spent the last hour or so futzing around with Oracle's built in date/time functions and Google but haven't managed to find anything which works yet. The nearest I got was using the timezone_names virtual table, e.g. this:
code:
select * from v$timezone_names where tzname = 'Europe/London'
Returns this:
code:
TZNAME        | TZABBREV
------------------------
Europe/London | LMT
Europe/London | GMT
Europe/London | BST
Europe/London | BDST
But there's no obvious way to detect whether DST is active and, even if there was, no clear way to know which one of the abbreviations to use. Plus Oracle doesn't even consider some of them to be valid, e.g. this works:
code:
select tz_offset('BST') from dual
But this returns ORA-01882 (timezone region not found):
code:
select tz_offset('BDST') from dual
I'm getting the feeling the answer is "no, not if you're only using Oracle".
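For what it's worth, outside of Oracle this is indeed a solved problem: the tz database carries both the abbreviation and the DST offset for any given instant. For comparison only, a sketch with Python's standard-library zoneinfo (Python 3.9+, so nothing 10g can call; it just shows what the lookup looks like):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # tz database access, stdlib in Python 3.9+

london = ZoneInfo("Europe/London")

# Summer: DST is in effect, so the abbreviation is BST.
summer = datetime(2012, 7, 4, 12, 0, tzinfo=london)
assert summer.tzname() == "BST"
assert summer.dst()          # nonzero timedelta -> DST active

# Winter: no DST, abbreviation GMT.
winter = datetime(2012, 1, 4, 12, 0, tzinfo=london)
assert winter.tzname() == "GMT"
assert not winter.dst()      # timedelta(0) -> DST inactive
```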

Dijkstracula
Mar 18, 2003

You can't spell 'vector field' without me, Professor!

Does anyone have a study or paper I can cite that times the cost of branch prediction vs branch misses? I'd like to be able to be more precise than "well, the branches in {this part of the system I'm working on for my thesis} will end up being marked as strongly-taken or strongly-not-taken so it's basically gonna be free I guess :confused: "

JawnV6
Jul 4, 2004

So hot ...
It's going to vary wildly between different processor families.

Even within a family, different types of branch mispredicts can have different costs, e.g. if the CPU predicts based on the IP (instruction pointer) and then corrects after decode, or if a prediction made at decode doesn't resolve until the backend computes some result that changes the branch path. If you've got a very specific system in mind, you could probably do some basic analysis on your own with PMONs; check the PRM vol. 3B.

Realistically on any sane code you're getting 95% or better and branch mispredicts aren't eating up a lot of your time anyway. If you want a cite for the basic bimodal predictor just use the Yeh-Patt paper.

Dijkstracula
Mar 18, 2003

You can't spell 'vector field' without me, Professor!

Yeah, every time I've needed to talk about very architecture-specific stuff I've gone looking for Nehalem stats, since that's what we're doing our evaluation on. (For context, the branches I'm speaking of would be conditional branches.)

I'm not sure the Yeh-Patt paper is the right one for me to cite, since the old Motorola chip they use almost certainly doesn't do out-of-order execution and isn't otherwise superscalar. My understanding of what would happen is that, say, a branch marked strongly-taken will start executing both paths and roll back the changes from the wrong one only after the conditional is actually evaluated. But I'll work forward from who cites Yeh-Patt and see what I can see. Thanks!

JawnV6
Jul 4, 2004

So hot ...

Dijkstracula posted:

My understanding of what would happen is that, say, a branch marked strongly-taken will start executing both branches

I'm only aware of one architecture that tried to do things this way (Sun's Rock). Remember that our predictors are well above 95% in most cases, why would you waste pipeline space on ops that you think have a 3% chance of actually mattering? You charge ahead on the probable path, knowing you'll have to clear the whole pipeline if you're wrong.

It really sounds like PMONs and TSC readings could get you exact data on what you need. You're having far too much fun anyway, curl up around a computer. Although chasing down every cite of Yeh-Patt probably has you doing that anyway.

Dijkstracula
Mar 18, 2003

You can't spell 'vector field' without me, Professor!

Oh interesting, I thought speculative execution was the norm now, but just executing the more likely path seems obvious in hindsight.

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
Thanks everybody for the suggestions on how to do the approximate matching I want. I'm not entirely sure what I'm going to try to work with yet. I will be setting up some testbench that first only functions with exact matches, so then I can punch in some different strategies to mix in approximate matches.

Zhentar
Sep 28, 2003

Brilliant Master Genius

Dijkstracula posted:

Oh interesting, I thought speculative execution was the norm now,

I think the whole "power" thing has pretty well put the kibosh on that. When you're TDP limited, you evaluate everything on performance/watt, and executing things that you aren't confident will actually be used won't tend to do very well in that regard.

KUBaNPhillipay
Jul 16, 2004
I'm working on a side project at work that will track how well our operations floor is running. My plan is to use Qt for the front-end GUI and some form of SQL to store the data in a database. There will probably be 5-10 people using this program, and it will generate graphs, charts, etc. based on the data. I'm still pretty new to making GUIs and have very little SQL knowledge, but I'm willing to learn.

My question is: can I put an SQL database (like MySQL, for example) on a network drive that is mapped the same on all the computers (so, for example, every computer has X:\Database) and have my program connect to it? Will this work if my users do not have any form of SQL installed on their computers? Is this a horrible idea, and is there a better way I should be doing this? My reasoning is that not all the people at my office are technically literate, so I would like to give them a very simple .exe file they can run, upload the files they need, and not have to worry about anything else.

Any advice would be appreciated!

Zhentar
Sep 28, 2003

Brilliant Master Genius
That's a terrible idea. Set up an actual database server and just have the application automatically connect to the server.

Scaevolus
Apr 16, 2007

Dijkstracula posted:

Does anyone have a study or paper I can cite that times the cost of branch prediction vs branch misses? I'd like to be able to be more precise than "well, the branches in {this part of the system I'm working on for my thesis} will end up being marked as strongly-taken or strongly-not-taken so it's basically gonna be free I guess :confused: "

Agner Fog has really good hard numbers for these kinds of questions.

Here's his software optimization page. You should read chapter 3 in his microarchitecture document.

ToxicFrog
Apr 26, 2008


KUBaNPhillipay posted:

I'm working on a side project at work that will track how well our operations floor is running. My plan is to use Qt for the front-end GUI and some form of SQL to store the data in a database. There will probably be 5-10 people using this program, and it will generate graphs, charts, etc. based on the data. I'm still pretty new to making GUIs and have very little SQL knowledge, but I'm willing to learn.

My question is: can I put an SQL database (like MySQL, for example) on a network drive that is mapped the same on all the computers (so, for example, every computer has X:\Database) and have my program connect to it? Will this work if my users do not have any form of SQL installed on their computers? Is this a horrible idea, and is there a better way I should be doing this? My reasoning is that not all the people at my office are technically literate, so I would like to give them a very simple .exe file they can run, upload the files they need, and not have to worry about anything else.

Any advice would be appreciated!

:stonk:

With most databases this isn't even possible, because the database is not necessarily a single file: it's accessed through a server. The client does not have access to the file(s) the server uses to store the database, nor should it: instead, it connects over the network, sends queries, and gets results back - or it connects indirectly, for example, through a web interface that is connected to the database.

This is what you should be doing. The end user won't need anything SQL-related installed, by the way - your program will have all of the code needed to communicate with the SQL server built in.

The exception to this is "self-contained" databases like sqlite, where the database is a single file and there is no separate server, the program just accesses it directly - but these are not safe for concurrent use by multiple processes and thus a terrible idea for your planned use.
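To illustrate the difference, here's the self-contained model with Python's standard-library sqlite3 — the library opens the file directly, with no server process in between (the table and values here are made up). Fine for a single process, but exactly what you don't want shared over a mapped network drive:

```python
import os
import sqlite3
import tempfile

# A self-contained database is just a file on disk that this process
# opens directly; there is no server mediating access.
path = os.path.join(tempfile.mkdtemp(), "metrics.db")
conn = sqlite3.connect(path)

conn.execute("CREATE TABLE readings (station TEXT, value REAL)")
conn.execute("INSERT INTO readings VALUES (?, ?)", ("floor-1", 42.0))
conn.commit()

rows = conn.execute("SELECT station, value FROM readings").fetchall()
assert rows == [("floor-1", 42.0)]
conn.close()
```

A client/server database replaces that connect-to-a-file step with a network connection to the server, which is what keeps concurrent writers from stepping on each other.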

Dijkstracula
Mar 18, 2003

You can't spell 'vector field' without me, Professor!

Scaevolus posted:

Agner Fog has really good hard numbers for these kinds of questions.

Here's his software optimization page. You should read chapter 3 in his microarchitecture document.
Yeah, I did go through Agner's microarchitecture document but I couldn't see the hard numbers I was looking for :sigh:


Blotto Skorzany
Nov 7, 2008

He's a PSoC, loose and runnin'
came the whisper from each lip
And he's here to do some business with
the bad ADC on his chip
bad ADC on his chiiiiip

Scaevolus posted:

Agner Fog has really good hard numbers for these kinds of questions.

He also has some libraries that you can hook into your code to get the TSC and PMON numbers that would let you quantify your code's performance yourself.


E: here is a project that I was tangentially involved with that wrapped these libs in a VS plugin: http://rcos.rpi.edu/projects/timing-framework/
