[Question] Doing an advanced kind of find and replace?

So I have a file path like this, which takes you to the folder containing all the images for whatever code 4000 stands for:

C:\MainProject\ProjectImages\Images4000\
An image link inside of this is then
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0035.jp2

These image links are in a list in Excel, not each image, but the ones that represent the first page in a record (they are all digital images of microfilmed documents).

So my list might be:
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0001.jp2
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0005.jp2
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0035.jp2
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0042.jp2
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0049.jp2
etc.

This goes for about 250 lines give or take the particular ImagesWZYX folder.

However, it might be the case that I need to convert them to URLs, and the URLs have a consistent structure, but I don't know if I can do an easy find and replace...
The URL would look like
https://thesiteweuse.org/stream/specialproject4000gaba#page/n1/mode/1up
https://thesiteweuse.org/stream/specialproject4000gaba#page/n5/mode/1up
https://thesiteweuse.org/stream/specialproject4000gaba#page/n35/mode/1up
https://thesiteweuse.org/stream/specialproject4000gaba#page/n42/mode/1up
https://thesiteweuse.org/stream/specialproject4000gaba#page/n49/mode/1up

What I am looking for is a way to do a find and replace of the C:\... blahblahblah to the URL versions. You can see that the page numbers in the URL (n1, etc) correspond to the image number (0001.jp2)

Is there a way to do this? Am I just in a dream world of too much hope?
 
Since it's in excel, I just use excel itself to rewrite them for me. I do it across several columns so I can go through each step and see the results, rather than doing it all at once in one big equation.

I'll do the example you presented above and give you the equations in the following posts.[DOUBLEPOST=1378838257,1378837781][/DOUBLEPOST]So in the first row, first column you have something like:
C:\MainProject\ProjectImages\Images4000\specialproject4000gaba_0001.jp2

In the cell to the right of it use this equation to get just the filename:
=MID(A1,FIND("0\",A1)+2,100)

It looks for "0\" but if that changes you'll need to search for something different. You could also use:
=MID(A1,FIND("specialproject",A1),100)

so the above two equations provide:
specialproject4000gaba_0001.jp2[DOUBLEPOST=1378838723][/DOUBLEPOST]In the next cell to the right, use the equation:
=LEFT(B1,FIND("_",B1)-1)

Which will remove the index and file type, giving you this:
specialproject4000gaba

then in the following cell to the right, let's extract the index. Now I'm going to assume that the format is always NNNN.jp2, but if it's different elsewhere obviously the equation will need to change:
=VALUE(LEFT(RIGHT(B1,8),4))

This looks at the full file name, grabs the last 8 characters, then grabs the first 4 characters of that substring, then converts it to a number. Of course you could split this up into several steps as well if it's easier to understand.

It returns:
1

Now you have the two parts you need to create your web links.[DOUBLEPOST=1378838864][/DOUBLEPOST]The next cell to the right wraps it all up using string concatenation:
="https://thesiteweuse.org/stream/" & C1 & "#page/n" & D1 & "/mode/1up"

Which gives the result of:
https://thesiteweuse.org/stream/specialproject4000gaba#page/n1/mode/1up[DOUBLEPOST=1378839094][/DOUBLEPOST]Copy the equations down the rows to cover all your input, and you've got a row with correct URLs. You may want to copy them, then paste them into a text editor, or back into excel using "paste values" so you get the final URLs, rather than equations. It took one colum for the original string, and 4 columns with equations to get the file name, get just the project name and index, and then combine them into the final URL.

I used the following text manipulation functions in the process:

MID
FIND
LEFT
RIGHT
VALUE

These are well documented and there are a ton of online tutorials if you end up needing to alter any of the above. While I could write a program to do this for me, I find that this is often just as quick and gives me instant gratification without having to debug so much.
 
@stienman I want to hug you so hard right now.[DOUBLEPOST=1378840179,1378840068][/DOUBLEPOST]Is it possible to make that one giant equation? I don't need to. I'm just curious.
 
Yep. Remember that each equation references one or more other cells, which are just equations, so you can simply embed them in each other:

="https://thesiteweuse.org/stream/" & C1 & "#page/n" & D1 & "/mode/1up"

References C1 and D1, so let's put the equations from C1 and D1 in there:

="https://thesiteweuse.org/stream/" & LEFT(B1,FIND("_",B1)-1) & "#page/n" & VALUE(LEFT(RIGHT(B1,8),4)) & "/mode/1up"

C1 and D1 referenced B1, so let's put B1's equation in there:

="https://thesiteweuse.org/stream/" & LEFT(MID(A1,FIND("0\",A1)+2,100),FIND("_",MID(A1,FIND("0\",A1)+2,100))-1) & "#page/n" & VALUE(LEFT(RIGHT(MID(A1,FIND("0\",A1)+2,100),8),4)) & "/mode/1up"

It's a pretty lengthy equation, and it's easy to get lost on the parenthesis counts. The real issue, though, is the duplication of B1 - by doing it in stages you can use previous results multiple times, and if you have to alter one later the changes cascade, rather than finding the three or more places in the big equation that you used the same chunk of text. For instance you might run across another project which has the same end format, but the beginning format isn't similar enough - we now have to edit three parts (the MID(A1,FIND("0\",A1)+2,100)) inside that big equation, and if one of them is wrong you won't be able to easily tell. So step by step equations are very useful for this sort of work.

Also, HUG ACCEPTED!

:minionhappy:
 
Sure looks easier than piping the text output through grep and then recapturing all the output into another text file, which is what I thought I was going to see here.

I would like to add that a pivotal step in the understanding of programming comes when you learn to recognize that results are just more data waiting to be manipulated by other routines, rather than some sort of "end."

--Patrick
 
Top