Wildcards – Computerphile


So, wildcards. The classic example of this is that you want to list files, and you say: find all the Word documents on my system. So if you had a command-line prompt on Windows, you would say something like dir *.doc or dir *.docx. The same sort of thing works in UNIX: you can do ls *.docx, and the output you get is a list of all the files that end with the extension .docx. So what's going on here? How does this actually work?

We generally talk about wildcards when we're thinking about files, but similar things are used in other places. I'll talk about files, but the same approach can be used elsewhere. A wildcard is just the idea that certain characters, rather than being compared literally against the actual file name or directory name, are interpreted as having a special meaning. The obvious one is star, which we tend to think of as meaning anything. So what we're saying here is: show me all the files (dir or ls, depending on whether you're using DOS or Linux) whose names are anything followed by .docx.

Let's first of all think about the different wildcards we can get, because there are various ones. The two main ones you come across are star, which means match anything, and question mark, which means match any single character. So, for example, if we wanted to find all the letters to mothers, we might do a search for m?m.docx. That would find mum.docx, it would also find mam.docx and mom.docx, and it would find m3m.docx. It will match any character in place of the question mark, so it's basically saying: at this point I'm not going to specify what's here, you match any character. You can also combine these, so you could search for m??m.docx, which of course won't match any of these, because the pattern has two characters there. But it would match a name with two characters in the middle; this could be m14m.docx, for example.

What about star, then? Star means match anything, and the way it works actually varies slightly between Windows and UNIX, or certainly between DOS and UNIX. So, for example, let's say we have pre*.docx. This would match prefix.docx, it would also match prepare.docx, and so on: anything that begins with p, r, e will match. You could do *ix.docx, which would then match suffix.docx, and of course unix.docx, because the star means anything at all can go there, as long as the name then ends in ix.docx. Star means any run of characters, zero or more, so this would also match ix.docx, and the earlier pattern would have matched pre.docx, because it's basically zero or more of any characters we like.

You can get slightly more advanced ones. UNIX allows you to say [a-z]file.doc, that is, a-dash-z in square brackets, then file.doc. UNIX would match afile.doc against that, and it would match tfile.doc, but it wouldn't match Bfile.doc, because what we've said here is match any character between lowercase a and lowercase z. This is an uppercase B, so it wouldn't match, whereas the others would. So that's basically what wildcards are.

To get that list of files, is it literally going through every file?

Yeah. Think back to the video on zero-size files.
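As an aside, the patterns described so far can be tried out with Python's standard fnmatch module, which implements UNIX shell-style wildcards. This is only a sketch for experimenting: the helper name matching and the file lists are made up for illustration, and fnmatchcase is used so that the comparison is case-sensitive on every platform.

```python
from fnmatch import fnmatchcase


def matching(pattern, names):
    """Return the names that match a shell-style wildcard pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical file names, taken from the examples in the talk.
letters = ["mum.docx", "mam.docx", "mom.docx", "m3m.docx", "m14m.docx"]
others = ["prefix.docx", "prepare.docx", "pre.docx", "suffix.docx", "unix.docx", "ix.docx"]
docs = ["afile.doc", "tfile.doc", "Bfile.doc", "3file.doc"]

print(matching("m?m.docx", letters))     # '?' stands for exactly one character
print(matching("m??m.docx", letters))    # two '?' need exactly two characters
print(matching("pre*.docx", others))     # '*' is any run of zero or more characters
print(matching("*ix.docx", others))      # note: prefix.docx also ends in ix.docx
print(matching("[a-z]file.doc", docs))   # one lowercase letter, then file.doc
```

Note that *ix.docx also picks up prefix.docx from this list, since that name ends in ix.docx as well.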

We talked about how the computer has a directory of all the files in it. So let's create a simple directory. Let's say we have lecture01, lecture02, lab01 (you can tell what I've been doing this week), lab02, and a file called notes. These are all separate files, and they're all in the catalog for the current directory. So let's say I wanted to match all the lectures. I could use a wildcard of lecture* to match all the lectures. So how would that be matched against each catalog entry?

First of all, let's see how it would work if we just specified an actual file name. What would happen if we wanted to search for lab01, because it's a bit further down? The system has got to find out whether that's actually a file in the directory or not, and there is a series of API calls in the operating system that allow you to search through directories. So what's going to happen? We come to the first file in the directory, lecture01, and we're looking for the file lab01. The computer compares the two character by character. It compares L and L, and they match. It then compares E and A; they don't match, so we know it doesn't match this one. We then do the same for lecture02, which begins the same way, so we get to the same point. We then come to the catalog entry lab01. We compare the first two characters: they match. The second two: they match. The third two, the last but one, and the last two characters all match, and so we've found the file. It would actually then keep going and compare with lab02, which would not match on the last character, and notes wouldn't match either, and then we would finish. So the first thing to note is that we check every file in the catalog each time we do this. You could probably put systems in place to make it faster; whether they do that or not is an implementation issue, and we won't go into that.

So how do we handle the wildcard? Let's think about the traditional Windows style, where star just means that anything that follows will match. Let's say we're looking for the lectures again, and we do the same thing: we search through the catalog, get to lecture01, and compare it with lecture*. We compare the first two characters: they match. The second two match, the third two match, and so on, until we get to the star, and they've all matched. If any of these hadn't matched, we'd stop, just like we did before. Now, in the Windows/DOS world, star means anything after here matches. So since we've got to this point and there's a star here, we say: OK, it doesn't matter what the rest is, we've matched. This is a big tick; we can use this file. We do the same with lecture02, and of course the same thing happens. With the others, when we get to them, they don't match and it carries on. So it builds up a list of results, or gives you something you can iterate through: you get the first, then the next, and the next, until you come to the end.

An interesting point is where this is actually done in the system, and that depends on the operating system you're using. On a UNIX system this is often done by the shell, so when the program gets called, it actually gets called with each of the different files as separate arguments passed into the UNIX executable, as opposed to the program having to do it itself.

You can see how this would then generalize to the question mark. Let's say we're searching for files that match lecture0 and then some digit, like lecture04.
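The catalog scan walked through above, for both an exact name and the simplified DOS-style star, can be sketched roughly like this. The names CATALOG, matches_prefix_star and find_matches are invented for illustration, and the rule that everything after the first star is accepted is the simplification described in the talk, not a faithful model of any particular operating system.

```python
# Hypothetical catalog for the current directory, as in the example.
CATALOG = ["lecture01", "lecture02", "lab01", "lab02", "notes"]


def matches_prefix_star(entry, pattern):
    """Compare character by character; a '*' accepts whatever follows.

    With no '*' in the pattern this is a plain exact-name comparison.
    """
    for i, p in enumerate(pattern):
        if p == "*":
            return True                      # anything after here matches
        if i >= len(entry) or entry[i] != p:
            return False                     # first mismatch: stop comparing
    return len(entry) == len(pattern)        # no star: lengths must agree too


def find_matches(catalog, pattern):
    """Check every entry in turn, yielding matches one at a time
    (in the spirit of 'get the first, then the next')."""
    for entry in catalog:
        if matches_prefix_star(entry, pattern):
            yield entry


print(list(find_matches(CATALOG, "lab01")))
print(list(find_matches(CATALOG, "lecture*")))
```

Writing find_matches as a generator mirrors the first/next style of iteration mentioned above: the caller pulls one result at a time until the catalog is exhausted.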

Let's say we use the pattern lecture0?. We compare each of these characters just as we did before, and they all match. When we get to this point, we see that this is a question mark, and we write the code so that a question mark matches any character. Of course, this wouldn't match a file named lecture0, because we'd have the question mark here and the file name would already be at the end of its string, so it wouldn't match. So we just have to write the string comparison algorithm to know about question marks, about stars, and about the other character codes that we can use to represent other things.
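One way such a comparison routine might look is sketched below. It handles the question mark as exactly one character and the star in the UNIX sense, as any run of zero or more characters anywhere in the pattern, by backtracking to the most recent star when a comparison fails. The function name wildcard_match is made up for this sketch, and character classes such as [a-z] are left out to keep it short.

```python
def wildcard_match(name, pattern):
    """Return True if name matches pattern.

    '?' matches exactly one character; '*' matches zero or more characters.
    """
    n = p = 0          # current positions in name and pattern
    star_p = -1        # position of the most recent '*' in the pattern
    star_n = 0         # position in name where that '*' started matching
    while n < len(name):
        if p < len(pattern) and pattern[p] == "*":
            # Remember the star and first try letting it match nothing.
            star_p, star_n = p, n
            p += 1
        elif p < len(pattern) and (pattern[p] == "?" or pattern[p] == name[n]):
            # '?' or an identical character: move on in both strings.
            n += 1
            p += 1
        elif star_p != -1:
            # Mismatch: let the last '*' swallow one more character and retry.
            star_n += 1
            n = star_n
            p = star_p + 1
        else:
            return False
    # The name is used up; only trailing stars may remain in the pattern.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)


print(wildcard_match("lecture04", "lecture0?"))
print(wildcard_match("lecture0", "lecture0?"))
print(wildcard_match("suffix.docx", "*ix.docx"))
```

The second call illustrates the end-of-string case from the talk: the name runs out while the pattern still has a question mark left, so there is no match.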
