Captcha1 | the Missing Lake

To get the flag we have to solve 300 captchas. Each captcha is an image containing text: we have to extract that text correctly and submit it, and every valid submission counts as one solved captcha.

Doing this by hand is impractical, so we write a Python script to automate it.

First, we look at the request that is made when we open the page and receive a new captcha image.

The response embeds the captcha in the HTML as a base64-encoded data:image/png;base64, URI, so the script has to extract it from the HTML content, decode it, and save it as an image file.
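The extract-and-decode step can be tried on its own before touching the network. This is a minimal sketch, not the exact code used for the challenge: the function name `extract_png_bytes` and the sample HTML fragment are made up for illustration, and only the `data:image/png;base64,` prefix comes from the actual response.

```python
import base64
import re

# Matches a PNG data URI and captures the base64 payload up to the closing quote.
DATA_URI_RE = re.compile(r'data:image/png;base64,([^"\']+)')


def extract_png_bytes(html):
    """Return the decoded bytes of the first embedded PNG, or None if absent."""
    match = DATA_URI_RE.search(html)
    if match is None:
        return None
    return base64.b64decode(match.group(1))


# Hypothetical page fragment, built here only to exercise the function.
sample_payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
sample_html = f'<img src="data:image/png;base64,{sample_payload}">'

png_bytes = extract_png_bytes(sample_html)
print(png_bytes[:4])
```

Returning `None` when nothing matches lets the caller decide what to do with a page that has no captcha, instead of crashing on `match.group(1)`.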

The script does the following:

  1. We send a GET request to the URL.

  2. We use a regular expression to extract the base64-encoded image data from the HTML response.

  3. We decode the base64-encoded image data and save it as an image file named "output.png" using the Python Imaging Library (PIL).

  4. We use the pytesseract library to perform OCR on the image obtained from the HTML response.

  5. The image_to_string function is used to extract text from the image.

  6. The extracted text is then printed to the console.

Make sure the requests, Pillow (PIL), and pytesseract libraries are installed in your Python environment, along with the Tesseract OCR engine itself:

pip install requests
pip install pillow
pip install pytesseract
sudo apt install tesseract-ocr
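
pytesseract is only a wrapper around the tesseract executable, so the script fails at the OCR step if the binary is missing. A quick way to check up front, as a sketch using only the standard library (the helper name `tesseract_available` is made up for illustration):

```python
import shutil


def tesseract_available(binary="tesseract"):
    """Return True if the Tesseract executable can be found on PATH."""
    return shutil.which(binary) is not None


if not tesseract_available():
    print("tesseract not found on PATH; install it with: sudo apt install tesseract-ocr")
```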

import re
import base64
from PIL import Image
from io import BytesIO
import requests
import pytesseract

# Send a GET request to the URL
url = "https://captcha1.uctf.ir"
headers = {
    "Cookie": "PHPSESSID=p0ubbj7gbpic6lmjlg251mtvhg; 926835342a210d84823968c8328cc3c8=7991ca4ed06637bb85462f16f7903c15",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0",
}
response = requests.get(url, headers=headers)

# Extract the base64-encoded image from the HTML response
html_content = response.text
match = re.search(r'data:image/png;base64,([^"]+)', html_content)
if match:
    base64_image = match.group(1)

    # Decode the base64-encoded image
    image_data = base64.b64decode(base64_image)
    image = Image.open(BytesIO(image_data))

    # Save the image to a file
    image.save("output.png")
    print("Image saved as 'output.png'")

    # Perform OCR on the image to extract text
    extracted_text = pytesseract.image_to_string(image)

    print("Extracted Text:")
    print(extracted_text)
else:
    print("Image not found in the HTML response")

After we get the extracted text from the image, we have to send it to the server as the captcha value in a POST request.

Here is the script, modified to send the extracted text as the captcha value in a POST request:

import re
import base64
from PIL import Image
from io import BytesIO
import requests
import pytesseract

# Send a GET request to the URL
url = "https://captcha1.uctf.ir"
headers = {
    "Cookie": "PHPSESSID=p0ubbj7gbpic6lmjlg251mtvhg; 926835342a210d84823968c8328cc3c8=7991ca4ed06637bb85462f16f7903c15",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0",
}
response = requests.get(url, headers=headers)

# Extract the base64-encoded image from the HTML response
html_content = response.text
match = re.search(r'data:image/png;base64,([^"]+)', html_content)
if match:
    base64_image = match.group(1)

    # Decode the base64-encoded image
    image_data = base64.b64decode(base64_image)
    image = Image.open(BytesIO(image_data))

    # Save the image to a file
    image.save("output.png")
    print("Image saved as 'output.png'")

    # Perform OCR on the image to extract text
    extracted_text = pytesseract.image_to_string(image)

    print("Extracted Text:")
    print(extracted_text)

    # Send a POST request with the extracted text as the captcha value
    captcha_value = extracted_text.strip()  # Remove leading/trailing whitespace
    post_url = "https://captcha1.uctf.ir"  # The form posts back to the same URL as the GET request
    post_data = {"captcha": captcha_value}
    post_response = requests.post(post_url, headers=headers, data=post_data)

    print("POST Response:")
    print(post_response.text)
else:
    print("Image not found in the HTML response")
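
Note that `.strip()` only trims whitespace at the two ends of the string. OCR output can also contain spaces or stray punctuation in the middle, which would make the submission fail. The sketch below filters the text down to letters and digits. That alphabet is an assumption, since the challenge does not say which characters a captcha can contain, and `normalize_ocr_text` is a made-up helper name.

```python
import re


def normalize_ocr_text(raw):
    """Drop everything except ASCII letters and digits from OCR output.

    Assumption: the captcha alphabet is [A-Za-z0-9]; adjust the pattern
    if the real captchas use other characters.
    """
    return re.sub(r"[^A-Za-z0-9]", "", raw)


print(normalize_ocr_text(" aB3 x9\n"))
```

It would be used in place of the plain `.strip()` call, on the string returned by `pytesseract.image_to_string`.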

To solve 300 captchas, we modify the script to keep a count of the captchas solved and to loop until that count reaches 300.

The script keeps fetching captchas and submitting values until it reaches the goal of 300. A submission counts as solved when the response does not contain "that ain't right". The script prints the running count as it progresses and a final message once 300 captchas are solved.

import re
import base64
from PIL import Image
from io import BytesIO
import requests
import pytesseract

# Define the URL for getting captcha images and submitting captchas
base_url = "https://captcha1.uctf.ir"
get_url = f"{base_url}/"
post_url = f"{base_url}/"

# Headers for the requests
headers = {
    "Cookie": "PHPSESSID=p0ubbj7gbpic6lmjlg251mtvhg; 926835342a210d84823968c8328cc3c8=7991ca4ed06637bb85462f16f7903c15",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0",
}

# Initialize variables to keep track of captchas solved
captchas_solved = 0

# Continue sending captchas until 300 are solved
while captchas_solved < 300:
    # Send a GET request to the URL to retrieve a new captcha image
    response = requests.get(get_url, headers=headers)

    # Extract the base64-encoded image from the HTML response
    html_content = response.text
    match = re.search(r'data:image/png;base64,([^"]+)', html_content)
    if match:
        base64_image = match.group(1)

        # Decode the base64-encoded image
        image_data = base64.b64decode(base64_image)
        image = Image.open(BytesIO(image_data))

        # Perform OCR on the image to extract text
        extracted_text = pytesseract.image_to_string(image).strip()  # Remove leading/trailing whitespace

        print("Extracted Text:")
        print(extracted_text)

        # Send a POST request with the extracted text as the captcha value
        post_data = {"captcha": extracted_text}
        post_response = requests.post(post_url, headers=headers, data=post_data)

        # Check if the response contains "that ain't right"
        if "that ain't right" not in post_response.text:
            captchas_solved += 1
            print(f"Captchas Solved: {captchas_solved}")

    else:
        print("Image not found in the HTML response")

print("Solved 300 captchas!")
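
One thing to keep in mind: the loop above only ends when the counter reaches 300, so if the session cookie expires or the OCR keeps failing it runs forever. A sketch of the same counting logic with a cap on attempts is shown below. The names `solve_until` and `attempt_once` are made up for illustration; `attempt_once` stands for one fetch/OCR/submit round that returns True when the answer was accepted.

```python
def solve_until(target, attempt_once, max_attempts):
    """Call attempt_once() until it has succeeded `target` times.

    attempt_once is a hypothetical callable that performs one round and
    returns True when the answer was accepted. Returns (solved, attempts)
    so the caller can tell whether the attempt cap was hit first.
    """
    solved = 0
    attempts = 0
    while solved < target and attempts < max_attempts:
        attempts += 1
        if attempt_once():
            solved += 1
    return solved, attempts


# Stand-in for a real round: succeeds on every second call.
calls = {"n": 0}


def fake_attempt():
    calls["n"] += 1
    return calls["n"] % 2 == 0


print(solve_until(3, fake_attempt, max_attempts=10))
```

With the real script, the body of the `while` loop would become `attempt_once`, and `max_attempts` could be set to a few times the target to leave room for OCR misses.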

Now we run the script. When it finishes, we have the flag.

Flag

UCTF{7h3_m1551n6_l4k3}
