regex

in LeoFinance, 11 days ago (edited)


Taking the plunge.

In my past coding adventures, regular expressions were one of those things I refused to touch with a ten-foot pole. I'd see the crazy nonsensical expression and be like, "Nope, not figuring that out today. Nobody needs that!" Turns out it's pretty useful once you actually figure it out.


regex is a search parameter

Is your program scanning a document and you need to find specific pieces of info within that document? This is the main use-case. The purpose of regex is to provide a search function with a very specific and powerful way of finding exactly the text we are looking for.
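As a minimal sketch of that idea in JavaScript (the sample sentence and the `findAmounts` helper are made up for illustration, they are not from the game):

```javascript
// Hypothetical helper: pull every dollar amount out of a block of text.
function findAmounts(text) {
  // \$ is a literal dollar sign, \d+ is one or more digits, and
  // (?:\.\d{2})? optionally allows cents. The g flag returns every match.
  const matches = text.match(/\$\d+(?:\.\d{2})?/g);
  // With the g flag, match() returns null when nothing is found.
  return matches === null ? [] : matches;
}

const sample = "Lunch was $12.50, the game cost $20 and parking was free.";
console.log(findAmounts(sample)); // [ '$12.50', '$20' ]
```

The point is that the pattern describes the shape of the text, so the same function works on any document without knowing in advance where the amounts are.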

So why did I choose now of all times to figure this out?

I started playing a very difficult multi-user dungeon called HackMUD with the intention of getting back into the coding game. The game is built using JavaScript and MongoDB (an extremely JavaScript-friendly document-oriented database). In order to play the game the player has to create their own scripts to interact with the command line and database. So far the scripts I've created are:

  • A cracker for breaking low-level (Tier 1) locks.
  • A garbage disposal for throwing away junk items.
  • A scraper that hacks corporate frontends and finds weak targets to attack.

All of these things can be done manually on the CLI (command-line) but as the player progresses through the game the point is to automate all the tedious bits so they no longer have to be done manually. On a technical level it's very easy to play the game without personally needing to code because other players make their code public and you can run their scripts. However if you run the wrong script you can lose all your money (and sometimes all your items/upgrades as well). Seeing as the entire point was to get back into coding I'm creating my own scripts from scratch and the process has been interesting (and very very slow). I'm also getting a lot of ideas for a game I wanted to create on Hive called Net Ninja. No promises.



So where does regex come in?

It all started when I saw this in someone else's hackMUD script:

.match(/`N(\w+)`.*$/)

OMG what does it mean though?

Luckily there are really good regex websites that will tell you exactly what the thing does.


https://regex101.com/

And when I first took a look at this... the website told me exactly what it did... but it still made no sense within the context of the game. What it's doing here is looking for a pair of graves/backticks with the letter N right after the first one. Between the N and the closing backtick it's looking for a word made of alphanumeric characters, which is what \w+ means: \w matches any alphanumeric character (including _ underscore apparently) and the + repeats that until it stops finding them. The parentheses capture that word so the script can use it. Then the .*$ swallows everything else up to the end of the string, so without a global flag this regex can only match once. Because . doesn't match line breaks in JavaScript, on multi-line output the match has to come from the last line.
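Here is a small sketch of how that pattern behaves in JavaScript. The sample strings and the `firstColoredWord` helper are placeholders I made up, not real game output. On a single line the leftmost pair wins; on multi-line text only the last line can satisfy the .*$ part.

```javascript
// Same pattern as in the script: backtick, N, a captured word, backtick,
// then everything up to the end of the string.
const pattern = /`N(\w+)`.*$/;

// Hypothetical helper that returns just the captured word.
function firstColoredWord(text) {
  const result = text.match(pattern);
  // match() returns null when nothing matches, otherwise an array where
  // index 0 is the whole match and index 1 is the first capture group.
  return result === null ? null : result[1];
}

// Made-up sample text, only the `N...` wrapping comes from the post.
console.log(firstColoredWord("Denied access by `Nsample_lock` lock.")); // sample_lock
console.log(firstColoredWord("`Nfirst` and `Nsecond`"));                // first
console.log(firstColoredWord("`Nfirst`\n`Nsecond`"));                   // second
console.log(firstColoredWord("no color codes here"));                   // null
```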

As if it wasn't confusing enough.

The reason why this was confusing to me within the context of HackMUD is that backtick pairs don't seem to exist. They are invisible on the frontend because they act as color codes.


https://hackmud.fandom.com/wiki/Colors

Hm... yep. So the backticks are invisible because they denote a certain color code. Jumped right into the deep end with the confusion on that one. The black box of HackMUD is quite unforgiving... but when you make something that actually works it's pretty fantastic. Not that I would recommend this game to anyone... I almost quit myself (a few times) because of how mind-numbingly difficult it is. Luckily the discord is very active and the entire point and difficulty of the game revolves around the multiplayer aspect.

So it turned out that the `N(word)` pattern being searched for is the color code for cyan (#00FFFF), which unsurprisingly turns out to be one of the most important colors in the game because it's the color of all locks and arguments given to dictionary keys. This particular regex was finding the type of lock that needs to be cracked. Only once I finally understood what it did was I able to incorporate it into my own scripts.

Creating my own regex

password = password.match(/strategy\s(\w+)/)

The corporations in HackMUD are very dumb and they tell you the password to their employees list (hackable targets) from the homepage frontend.



Some might consider this a spoiler, but I don't know, the game is so hard and no one here is going to play it, so whatever. Navigating to the "about us" page (in this case "info") seems to always provide the password after "We are calling this strategy: ******". The strategy is always the password, so I wrote the regex /strategy\s(\w+)/ to grab it. The code finds "strategy", then looks for a whitespace character with \s, then grabs the word after it with (\w+), just like in the previous example. Looks like I'm learning regex. If I'm being honest it's almost embarrassing it took me this long and required a faux hacking game to get it done.
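A rough sketch of how that line can be used safely. The page text and the `extractPassword` name are placeholders, not real game output. Note that match() returns an array (or null), so the password itself is at index 1, and that the pattern assumes "strategy" is followed directly by whitespace: if the page really had a colon after it, \s would not match.

```javascript
// Hypothetical helper around the regex from the post.
function extractPassword(pageText) {
  const result = pageText.match(/strategy\s(\w+)/);
  // No match means null, so check before reading the capture group.
  return result === null ? null : result[1];
}

// Made-up page text for illustration.
console.log(extractPassword("We are calling this strategy placeholder_pw and moving on."));
// placeholder_pw

// A colon right after "strategy" breaks the match, because ":" is not whitespace.
console.log(extractPassword("We are calling this strategy: placeholder_pw"));
// null
```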

Conclusion

Regex is a very powerful (and often confusing) tool for advanced text searching. Seeing as Hive itself is just one big lump of text I'm glad that I'm starting to get a feel for how these searches work. Ultimately I'd like to build a basic game that helps our users learn JavaScript and get rewarded while doing it. Maybe I'll even create a Game Design Document for Net Ninja in a post sometime. Dream Big!


Regex is so powerful. I have always used it to validate input in web forms. Some colleagues hate it because they won't spend a few hours learning it properly, but that is not my problem. In conclusion: if you can fix something with a built-in feature like regex without installing any framework, why would you suggest installing a framework? YOU MUST LEARN REGEX, DON'T BE LAZY :D

Let's test your skills. What does this do?

mystery_regex = (
    r"^(?!string$)(?=.{3,16}$)[a-z]([0-9a-z]|[0-9a-z\-](?=[0-9a-z]))"
    r"{2,}([\.](?=[a-z][0-9a-z\-][0-9a-z\-])[a-z]([0-9a-z]"
    r"|[0-9a-z\-](?=[0-9a-z])){1,}){0,}$"
)

"This regular expression (mystery_regex) appears to be a pattern designed to match valid domain names, following the rules and constraints set for domain names. Let's break it down step by step:

  1. Length of the string: (?=.{3,16}$): This is a positive lookahead assertion that ensures the entire string has a length between 3 and 16 characters (inclusive).

  2. First character must be a lowercase letter: [a-z]: The first character should be a lowercase letter.

  3. Following characters:

    • ([0-9a-z]: The second character should be either a digit or a lowercase letter.
    • |[0-9a-z\-](?=[0-9a-z]): Or, a digit, a lowercase letter, or a hyphen (-), followed by either a digit or a lowercase letter.
    • ){2,}: Repeat the above pattern at least twice.
  4. Optional subdomains:

    • ([\.](?=[a-z][0-9a-z\-][0-9a-z\-])[a-z]([0-9a-z]|[0-9a-z\-](?=[0-9a-z])){1,}): This matches a period (\.) followed by a lowercase letter and two characters, each of which can be a digit, a lowercase letter, or a hyphen (-). This pattern must occur at least once.
    • {0,}$: This makes the entire subdomain part optional.

In summary, this regex pattern is designed to match valid domain names, allowing for subdomains and enforcing specific rules for the characters that can be used in each part of the domain name."

Regards
Pi.ai

Hark, the evil one has returned to bless us with his black magic.

Here's a clue (from Co-Pilot)

The mystery_regex is a regular expression pattern used to validate a string. Let's break it down:

^(?!string$): This is a negative lookahead assertion that checks if the string is not exactly "string". If the string is exactly "string", it will not match the pattern.

(?=.{3,16}$): This is a positive lookahead assertion that checks if the string length is between 3 and 16 characters. If it's not, the string will not match the pattern.

[a-z]: This checks that the first character is a lowercase letter.

([0-9a-z]|[0-9a-z\-](?=[0-9a-z])){2,}: This checks that the following characters (at least 2) are either a lowercase letter, a number, or a hyphen, but a hyphen must be followed by a lowercase letter or a number.

([\.](?=[a-z][0-9a-z\-][0-9a-z\-])[a-z]([0-9a-z]|[0-9a-z\-](?=[0-9a-z])){1,}){0,}$: This checks that if there is a period in the string, it must be followed by a sequence that starts with a lowercase letter, followed by at least one character that is either a lowercase letter, a number, or a hyphen (but a hyphen must be followed by a lowercase letter or a number). There can be zero or more such sequences in the string.

In summary, this regular expression pattern is used to validate a string that:

  • Is not exactly "string"
  • Is between 3 and 16 characters long
  • Starts with a lowercase letter
  • Is followed by at least two characters that are either a lowercase letter, a number, or a hyphen (but a hyphen must be followed by a lowercase letter or a number)
  • If it contains a period, the period must be followed by a sequence that starts with a lowercase letter, followed by at least one character that is either a lowercase letter, a number, or a hyphen (but a hyphen must be followed by a lowercase letter or a number). There can be zero or more such sequences in the string.
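One way to check these readings is to just run the pattern. Below is a rough JavaScript port, assuming the three Python raw strings are meant to be joined into one pattern. `String.raw` plays the role of Python's r"" here, and the test names are arbitrary examples.

```javascript
// The three pieces from the comment above, concatenated into one pattern.
// String.raw keeps the backslashes exactly as typed.
const mysteryRegex = new RegExp(
  String.raw`^(?!string$)(?=.{3,16}$)[a-z]([0-9a-z]|[0-9a-z\-](?=[0-9a-z]))` +
  String.raw`{2,}([\.](?=[a-z][0-9a-z\-][0-9a-z\-])[a-z]([0-9a-z]` +
  String.raw`|[0-9a-z\-](?=[0-9a-z])){1,}){0,}$`
);

console.log(mysteryRegex.test("edicted")); // true
console.log(mysteryRegex.test("string"));  // false: excluded by (?!string$)
console.log(mysteryRegex.test("ab"));      // false: shorter than 3 characters
console.log(mysteryRegex.test("a-b"));     // true: hyphen followed by a letter
console.log(mysteryRegex.test("ab-"));     // false: trailing hyphen
console.log(mysteryRegex.test("abc.def")); // true: dot followed by a 3-character segment
console.log(mysteryRegex.test("abc.de"));  // false: segment after the dot is too short
console.log(mysteryRegex.test("1abc"));    // false: must start with a lowercase letter
```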

Unsurprisingly it got a bit easier when I was told that r"" isn't part of the regex.

Yeah that's just a python syntax wrapper on the thing.

This one is good but let's see if you can figure out this one:

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
Never gonna make you cry
Never gonna say goodbye
Never gonna tell a lie and hurt you

Alright I actually went through it a bit just for the practice and it seems like a huge troll.

  • Right off the bat you don't give me the global modifiers which is a problem.
  • The first line has ^ which is the beginning of the line, but the beginning of the line starts with r". Even just typing r"^ into regex I couldn't figure out how to match it. Troll.
  • Then in the second line you're doing a lookahead of 3 alphanumeric characters but then when the lookahead retraces you're looking for 2 alphanumeric characters and a ". The third character can't be " because it was already confirmed to be not ". Again troll.

Because both the first and second lines don't even make sense and I can't match them, this is where I quit. Although I thank you for showing me what positive and negative lookaheads are; that is quite useful.
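For anyone tripping over the same thing: a lookahead only checks what comes next without consuming it, so the characters it inspected get matched again by whatever follows. A small JavaScript sketch with made-up inputs:

```javascript
// Positive lookahead: match a lowercase letter only if a digit follows.
// The digit is inspected but is not part of the match.
const beforeDigit = "a1".match(/[a-z](?=\d)/);
console.log(beforeDigit[0]); // a

// Negative lookahead: match a lowercase letter that is NOT followed by a digit.
const notBeforeDigit = "a1b".match(/[a-z](?!\d)/);
console.log(notBeforeDigit[0]); // b

// A lookahead at the start acts as a precondition on the whole string:
// the length is checked first, then \w+ matches the same characters again.
const lengthChecked = /^(?=.{3}$)\w+$/;
console.log(lengthChecked.test("abc"));  // true
console.log(lengthChecked.test("abcd")); // false
```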

Can't wait for you to tell me how we're supposed to match r" and have that occur before the beginning of a string.


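The guys in the hackMUD discord pointed out that r"" means raw and that this is probably a Python script. (In Python, adjacent string literals inside parentheses are joined into one string, so the three pieces form a single pattern rather than a tuple of three regexes.)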

This is very troll sir. :D

I skipped some of the post because I didn't want to spoil the game. Looks pretty cool. I'm a dual major in school. I combined neuroscience and computer science. CS has made me start to hate computers. I bet with the stuff I have learned in class it would still be difficult, but it looks fun.

Worth the price tag?

$20 seems like a lot for a game that runs on a command line console but I've gotten more than my money's worth. At the same time it's an MMO so a flat fee to play and the ability to use their Mongo DB is pretty nice. I would not be coding right now if I hadn't picked up this game so I'm hoping to tap into that momentum.

I see the aspiration for games, but are you thinking you'd like to build it on Hive, or just use Hive like a currency? I've found some issues in looking for adequate smart contract ability from sidechains.

I think building it on Hive would work quite nicely because there needs to be a cost to do things.
But that cost also needs to be very low.
Resource credits would work quite nicely for that.
There are also other advantages like memo key encrypted chatrooms and the ability to leverage the security of the entire chain to make sure no one can cheat "not even the devs"

Maybe I'm not creative enough, but there always seems to be one key piece of functionality missing prior to building an application. I have something big I want to include the Hive community in on, but it's hard to imagine it working without certain features. One thing can be certain man, blockchain is so niche in academia (mainly on the depth of creativity) that most people here could be published authors. (I published essentially a whitepaper.) I think there could be a GIANT use that hasn't been implemented correctly, but I need that deterministic algorithm type stuff that I get from saving state in an Ethereum Virtual Machine based system. Free time/motivation is also a limiting factor for me.

If I find anything worth using, I'll share it with you.

Good luck to both of us.

More infrastructure certainly would be nice, but at the end of the day any project can just be centralized to a single node and database which greatly reduces complexity. Most EVM projects are so heinously centralized with the dev team able to mint any coin or change any contract that it doesn't even matter that they're building on a decentralized chain.

No, I completely agree with that logic. It all just takes a little changing of the code.

Also it takes over 5 hours to get out of vLAN.
I think it may have taken me 8 hours.
vLAN is the single player tutorial that gets you ready to play the real sandbox.
You must "prove your sentience".

Damn. I wonder how long the gameplay is in total. I know it would vary. (Especially for someone like me who misses an extra comma or something benign for like 3 days.)

Well it's a puzzle game and a sandbox and an MMO so I'm not really sure if you can win.
The puzzles are crazy hard and take totally random amounts of time to crack.
The sandbox lets you go off and do random stuff.
A system can only sys.init 4 times (level 0,1,2,3,4).
I've been playing a while and am barely level 2 after automating T1 stuff.
Game progress comes in bursts: you'll be stuck on something and then make a breakthrough.
The learning curve walls are very high.

Sounds like an interesting game, I might give it a go. Do you store your scripts in the game somehow, or do you keep them offline? Just wondering how you can do software development / testing / debugging from a game.


That's a really cool concept. If they had something like that for Python I might actually look into it. I have been trying to learn to code in Python a bit for a while now. I've done a couple of things with BEEM, but not enough to consider myself proficient at it.

Yeah I haven't used Python in over a decade and I keep trying to do stuff in JavaScript that Python can do and it throws an error lol. Python is pretty great and if my game actually got traction I think it would definitely be on the top of the list to incorporate. The problem is that JavaScript is amazing because it can ping nodes and interact directly with websites and HTML. Python is better at scaling and object oriented stuff.

I've often wondered if I should shift my focus to Java, but I feel like Python is just an easier reentry for me into coding. Last time I coded was 25 years ago in college and it was C++ and COBOL.

It's funny that you mention COBOL in your comment. It made the news for a few weeks in 2020h1 because many state-level Departments of Labor needed their COBOL systems updated to handle the influx of people put on unemployment due to the forced layoffs caused by state-level pandemic policies.

As much as we want to think that Python and PHP and Ruby on Rails and JavaScript are a must for programming systems in the 21st Century, the astounding quantities of COBOL systems used by governments and banks ensure that COBOL will be needed far into the future, if for no other reason than program maintenance. Like World War 2 veterans, COBOL programmers are fewer and fewer each day, and they will be paid a premium just to keep the systems up and running.

Whatever species succeeds homo sapiens as travelers through space and time will discover that devices used for warp drive and stargate travel have some COBOL at their most fundamental level of functionality.

Unfortunately I was never that good at coding or debugging code, otherwise I might have taken that path. I remember it was one of the less annoying languages I had to learn.

java javascript ham hamster.png

It makes me cringe to hear you say Java to mean JavaScript :D

Python is great but even Hive uses CUSTOM JSON which is native javascript.
Wanna do anything on a website? Need Javascript.

I agree that Python is a better place to start but Javascript is also much more basic and easy to deal with compared to C++ or other similar languages. All scripting languages have increased simplicity. Javascript has a crazy amount of power behind it. At the same time if you learn Python a lot of that knowledge is going to translate to Javascript.

Ah yeah, sorry for that. Java is dead and should have died a lot sooner than it did. I'm so glad I don't have to deal with it anymore with my end users. I stopped supporting it as soon as they announced they were shutting it. If the resource couldn't shift to HTML5 then they needed to find something else.

Yes, regex is a pain. But it kind of beats the alternative of typing everything out. "Match any word of three letters or more at the beginning of a line as long as the next word starts with a number between..." But you can type out what you want into ChatGPT and it does a pretty good job spitting out valid regex.

Looking at all of these codings, it’s making me feel like you’re writing in Arabic cos a novice like me won’t understand😅
