Hello,
Some of the solutions provided in the exercises are not correct:
For this exercise (n. 10)
The solution provided is egrep -o ‘\b[A-Z][a-z]{2, }\b’ /etc/nsswitch.conf > /home/bob/filtered1
because it says "Filter out the lines that contain any word that starts with a capital letter and are then followed by exactly two lowercase letters
So why are there:
\b - what does that mean?
{2, } is not correct, because that means minimum 2 and max unlimited (or at least 2). But the exercise says EXACTLY 2 lowercase. Shouldn’t that be {2} ?
Thanks
-o does not “filter out the lines”, it actually filters out only the matching substrings.
\b is a “word boundary” (see e.g. Regular expression - Wikipedia)
Regarding {2} - yes, you’re probably right (almost, I think)
I believe \b[A-Z][a-z]{2}\b would be even more appropriate (at least for Latin / ASCII alphabet) if we are to extract only complete, 3-character words.
(but \b[[:upper:]][[:lower:]]{2}\b if you asked me about any other alphabets with diacritics).
So it seems that using -o + \b is like using ** grep -w** ?
I matches just a word, not a line.
Anyway, the solution does not seem to be correct, as the exercise asks to filter out the lines, not the just the words (which is what the solution provided instead does).
Well, not exactly. As -w will match a word, it will still display the entire line with this matching word, whereas -o will only display the matching substring (“word” itself), and it will not display the entire line containing it.