Learning about visitors to my website

After creating a hidden service to host this website, I was curious about what kind of information I could obtain about visitors to that site and also the clearnet version. I was familiar with the concept of browser fingerprinting and ChatGPT was able to provide some JavaScript code to access some basic information.

const browserInfo = {
    userAgent: navigator.userAgent,
    language: navigator.language,
    platform: navigator.platform,
    screenResolution: `${window.screen.width}x${window.screen.height}`,
    viewportSize: `${window.innerWidth}x${window.innerHeight}`
};
                        

ChatGPT was also able to provide some basic code for obtaining a visitor's location through the browser's Geolocation API, but this requires the visitor to consent to sharing their location. Most people are familiar with seeing these permission requests in their browser as they visit sites.

In the context of a hidden service, any IP address I could observe for a visitor would belong to a Tor relay (for example, the exit node their circuit uses to reach a clearnet API) rather than to the visitor, and so wouldn't be useful for learning about that visitor. I would still find this information interesting, and would certainly like to know where my website traffic is originating from on the clearnet.

if (navigator.geolocation) {
    navigator.geolocation.getCurrentPosition(
        position => {
            console.log(`Latitude: ${position.coords.latitude}`);
            console.log(`Longitude: ${position.coords.longitude}`);
        },
        error => {
            console.error('Error obtaining geolocation:', error);
        }
    );
} else {
    console.error('Geolocation is not supported by this browser.');
}     
                        


This isn't very stealthy, though, and most users of the Tor Browser likely wouldn't consent to share their geolocation. Additionally, I never grant permission to sites to use my location, mostly out of habit. I found a third-party API called IP API (ipapi.co) that will provide location-based information about site visitors, derived from their IP address. Here is a link to the JSON output of this API (clicking the link will reveal what this API can determine about you). Fairly extensive! This was just what I was looking for.

I added some JavaScript to my website running on localhost and also spun up a web server I wrote using Go to receive the information obtained by this JavaScript code. This Go web server is discussed towards the bottom of this page. It is important to acknowledge that this implementation won't actually work in the wild, as it uses the fetch() method to make a GET request to an external domain and a POST request to a different port on the same host. Both are cross-origin requests (an origin is the combination of scheme, host, and port), and browsers block them unless the receiving server opts in through Cross-Origin Resource Sharing (CORS) headers.

    function collectUserData() {
        const userData = {
            userAgent: navigator.userAgent,
            screen: {
                width: window.screen.width,
                height: window.screen.height
            },
            language: navigator.language || navigator.userLanguage,
            timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
            referrer: document.referrer,
            date: new Date().toISOString() 
        };
        return userData;
    }

    async function fetchIPData() {
        try {
            const response = await fetch('https://ipapi.co/json/');
            const data = await response.json();
            return {
                ip: data.ip,
                city: data.city,
                region: data.region,
                country: data.country_name,
                latitude: data.latitude,
                longitude: data.longitude
            };
        } catch (error) {
            return {};
        }
    }

    async function sendDataToServer() {
        const userData = collectUserData();
        const ipData = await fetchIPData();

        const fullData = {
            userData: userData,
            ipData: ipData
        };

        // Post to Go-Creep web server, discussed below
        fetch('http://127.0.0.1:4141', { 
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify(fullData)
        })
        .then(response => response.json())
        .then(data => console.log('Success:', data))
        .catch((error) => console.error('Error:', error));
    }

    sendDataToServer();
                            
                        

The image below shows the console error resulting from the fetch() request to my Go-Creep server:

image of console output

request headers

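The error message shown in the console states that the request was blocked because no Access-Control-Allow-Origin header was set. This puzzled me at first, as the GET request to a third-party domain did succeed on some of the browsers I tested, but the request to a different port on the same localhost domain was always blocked. The likely explanation is that CORS is opt-in on the receiving side: the third-party API evidently sends the header with its responses, while my Go server did not yet, and a different port counts as a different origin even on the same host.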

console output


JavaScript runs client-side, on someone's device inside of a web browser. Modern browsers attempt to protect their users from network activity that could potentially be dangerous, such as Cross-Site Request Forgery, Cross-Site Scripting, Clickjacking, and a myriad of other security and privacy concerns.

However, there are many legitimate reasons why a website would want to perform client-side requests to external domains. A good example is Content Delivery Networks (CDNs), which are commonly used on sites that serve content popular across the web, such as news sites. CDNs allow static content like photos, videos, fonts, and even popular front-end libraries like Bootstrap to be widely distributed across the internet from a centralized source. Websites can be configured to allow traffic to and from sources like CDNs or third-party APIs, and in fact I could configure a CORS policy on my website to allow the behavior intended above. There is a better solution, though.

Avoiding (some) CORS restrictions using server-side logic

We won't be able to avoid CORS restrictions completely, because JavaScript is still necessary to obtain a browser fingerprint involving a visitor's screen dimensions, language, timezone, referral source, and user-agent, and we still need to call fetch() to send this to our server. It will be necessary to set headers on that server allowing the request, because a different port on the same host is still a different origin.

I've created a webserver using Go that will configure the CORS policy by setting the appropriate headers, receive browser fingerprinting information, and obtain IP address information directly from the requests:



  • I am using the Go module github.com/rs/cors to set the necessary headers on the server that will receive user information.
  • I haven't discussed rate limiting in this article, but I plan to expose this server to the public internet so that I may download reports describing visitors to my site. I want to rate-limit the endpoint that generates these reports so that no one spams it and racks up a bill on my cloud-provisioned server. I've also implemented some basic authentication in addition to the rate limiter.
  • I've defined structs to decode JSON payloads into a format my server can work with.
  • For now I am only storing user information specific to each page of interest in a slice stored in memory. Ultimately this would be better stored in a database, but doing so involves a lot of overhead and right now I am most interested in creating a minimal reproducible example.
  • The main function maps endpoints to functions specific to each page and wraps the mux in the CORS handler so that the headers are applied to every endpoint. This way, when my website sends a POST request to this server, the fetch() call won't be blocked by the browser.
  • Below is the receiveDataFromHomePage() function definition; the other functions mapped to endpoints are similar:
func receiveDataFromHomePage(w http.ResponseWriter, r *http.Request) {
    data, err := collectUserData(w, r)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    homePageVisitors = append(homePageVisitors, data)

    city := data.City
    region := data.Region
    log.Printf("New visitor to home page from: %v, %v\n", city, region)

    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusCreated)
    if err := json.NewEncoder(w).Encode(data); err != nil {
        // The 201 status has already been sent, so the failure can only be logged.
        log.Printf("Failed to encode response: %v", err)
    }
}
                    
package main

import (
    "log"
    "net/http"
    "sync"

    "github.com/rs/cors"
    "golang.org/x/time/rate"
)

type RateLimiter struct {
    limiters map[string]*rate.Limiter
    mu       sync.Mutex
}

type UserData struct {
    UserAgent string `json:"userAgent"`
    Screen    struct {
        Width  int `json:"width"`
        Height int `json:"height"`
    } `json:"screen"`
    Language string `json:"language"`
    Timezone string `json:"timezone"`
    Referrer string `json:"referrer"`
    Date     string `json:"date"`
}

type FullData struct {
    UserData
    IP        string  `json:"ip"`
    City      string  `json:"city"`
    Region    string  `json:"region"`
    Country   string  `json:"country"`
    Latitude  float64 `json:"latitude"`
    Longitude float64 `json:"longitude"`
}

var rateLimiter = NewRateLimiter()

var homePageVisitors = make([]FullData, 0)
var aboutPageVisitors = make([]FullData, 0)
var academicPortfolioPageVisitors = make([]FullData, 0)
var EULAPageVisitors = make([]FullData, 0)
var blogPrivacyPageVisitors = make([]FullData, 0)
var weatherPrivacyPageVisitors = make([]FullData, 0)

func main() {
    mux := http.NewServeMux()

    mux.HandleFunc("/HomePage", receiveDataFromHomePage)
    mux.HandleFunc("/AboutPage", receiveDataFromHAboutPage)
    mux.HandleFunc("/AcademicPage", receiveDataFromAcademicPortfolioPage)
    mux.HandleFunc("/EULAPage", receiveDataFromEULAPage)
    mux.HandleFunc("/BlogPrivacyPage", receiveDataFromBlogPrivacyPage)
    mux.HandleFunc("/WeatherPrivacyPage", receiveDataFromWeatherPrivacyPage)
    mux.HandleFunc("/GetUserInfo", rateLimited(downloadReport))

    c := cors.New(cors.Options{
        AllowedOrigins:   []string{"http://localhost:5500", "http://localhost:5500"},
        AllowedMethods:   []string{"POST"},
        AllowedHeaders:   []string{"Content-Type"},
        AllowCredentials: true,
    })

    handler := c.Handler(mux)

    log.Println("Server is listening on port 4141...")
    if err := http.ListenAndServe(":4141", handler); err != nil {
        log.Fatal("Server failed to start:", err)
    }
}                        
                    

I've defined some helper functions:

package main

import (
    "encoding/json"
    "io"
    "net/http"
)


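// getIpInfo asks ipapi.co to describe the caller of the API. Because this
// call is made by the server, the result describes the server's own public
// IP address; it matches the visitor only while the browser and the server
// share a machine, as they do on localhost.
func getIpInfo() (map[string]interface{}, error) {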
    resp, err := http.Get("https://ipapi.co/json/")
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }

    var ipData map[string]interface{}
    if err := json.Unmarshal(body, &ipData); err != nil {
        return nil, err
    }

    return ipData, nil
}

func collectUserData(_ http.ResponseWriter, r *http.Request) (FullData, error) {
    var userData UserData
    if err := json.NewDecoder(r.Body).Decode(&userData); err != nil {
        return FullData{}, err
    }

    ipData, err := getIpInfo()
    if err != nil {
        return FullData{}, err
    }

    fullData := FullData{
        UserData:  userData,
        IP:        getStringFromMap(ipData, "ip"),
        City:      getStringFromMap(ipData, "city"),
        Region:    getStringFromMap(ipData, "region"),
        Country:   getStringFromMap(ipData, "country_name"),
        Latitude:  getFloat64FromMap(ipData, "latitude"),
        Longitude: getFloat64FromMap(ipData, "longitude"),
    }

    return fullData, nil
}

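// getStringFromMap returns m[key] as a string, or "" if the key is missing
// or holds a value of another type (a bare type assertion would panic).
func getStringFromMap(m map[string]interface{}, key string) string {
    if val, ok := m[key].(string); ok {
        return val
    }
    return ""
}

// getFloat64FromMap returns m[key] as a float64, or 0 if the key is missing
// or holds a value of another type.
func getFloat64FromMap(m map[string]interface{}, key string) float64 {
    if val, ok := m[key].(float64); ok {
        return val
    }
    return 0.0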
}
                    

I've made changes to the JavaScript code on the home page:

function collectUserData() {
    const userData = {
        userAgent: navigator.userAgent,
        screen: {
            width: window.screen.width,
            height: window.screen.height
        },
        language: navigator.language || navigator.userLanguage,
        timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
        referrer: document.referrer,
        date: new Date().toISOString() 
    };
    return userData;
}

async function sendDataToServer() {
    const userData = collectUserData();

    fetch('http://127.0.0.1:4141/HomePage', {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify(userData)
    })
    .then(response => response.json())
    .then(data => console.log('Success:', data))
    .catch((error) => console.error('Error:', error));
}

sendDataToServer();
                    

The end result when running both my web server and the server used to receive user information (which I named Go-Creep) is successful communication between the two servers:



console output

console output

I also confirmed that I could download a report of visitor information from the appropriate endpoint:



console output

console output

console output

Everything is looking good! The next step is deploying Go-Creep to the cloud and modifying the source code of this website to POST the data to the cloud-hosted URL, since 127.0.0.1 or localhost will resolve to each visitor's own loopback interface, which isn't going to be very helpful. In the past, I have deployed servers to Amazon EC2 after containerizing them using Docker. I think this would be a great solution; however, I am going to stop here because I currently use Cloudflare Pages to host this website. In the future, I would like to create my own web server using Go or Rust in order to host this site, but right now Cloudflare Pages is the most economical option, as I have already maxed out my AWS free tier. If and when I complete this necessary step, I will be sure to update this page.

Final security considerations

Even though I don't have plans to expose the Go-Creep server to public internet traffic immediately, I wanted to enforce some security policies to learn more about best practices when using Go to create web servers. In the code below, I've added middleware that sets security headers recommended by OWASP, restricted endpoints to their expected HTTP methods, and set timeouts for requests. I've also added validation to the endpoint function that reads a query parameter, so that this input is checked before it is used.

func securityHeadersMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("Content-Security-Policy", "default-src 'self'")
        w.Header().Set("X-Content-Type-Options", "nosniff")
        w.Header().Set("X-Frame-Options", "DENY")
        w.Header().Set("Strict-Transport-Security", "max-age=63072000; includeSubDomains")
        next.ServeHTTP(w, r)
    })
}

func main() {
    r := mux.NewRouter()

    r.HandleFunc("/HomePage", GoCreep.ReceiveDataFromHomePage).Methods("POST")
    r.HandleFunc("/AboutPage", GoCreep.ReceiveDataFromHAboutPage).Methods("POST")
    r.HandleFunc("/AcademicPage", GoCreep.ReceiveDataFromAcademicPortfolioPage).Methods("POST")
    r.HandleFunc("/EULAPage", GoCreep.ReceiveDataFromEULAPage).Methods("POST")
    r.HandleFunc("/BlogPrivacyPage", GoCreep.ReceiveDataFromBlogPrivacyPage).Methods("POST")
    r.HandleFunc("/WeatherPrivacyPage", GoCreep.ReceiveDataFromWeatherPrivacyPage).Methods("POST")
    r.HandleFunc("/GetUserInfo", GoCreep.RateLimited(GoCreep.DownloadReport)).Methods("GET")

    r.Use(securityHeadersMiddleware)

    c := cors.New(cors.Options{
        AllowedOrigins:   []string{"http://localhost:5500", "http://localhost:5500"}, // change accordingly
        AllowedMethods:   []string{"POST"},
        AllowedHeaders:   []string{"Content-Type"},
        AllowCredentials: true,
    })

    handler := c.Handler(r)

    srv := &http.Server{
        Handler:      handler,
        Addr:         "0.0.0.0:4141",
        WriteTimeout: 15 * time.Second,
        ReadTimeout:  15 * time.Second,
        IdleTimeout:  60 * time.Second,
    }

    log.Println("Server is listening on port 4141...")
    if err := srv.ListenAndServe(); err != nil {
        log.Fatal("Server failed to start:", err)
    }
}

func validateToken(token string) error {
    if len(token) == 0 {
        return errors.New("token is empty")
    }
    
    // Check if the token is alphanumeric
    for _, char := range token {
        if !unicode.IsLetter(char) && !unicode.IsDigit(char) {
            return errors.New("token contains invalid characters")
        }
    }
    
    return nil
}

func DownloadReport(w http.ResponseWriter, r *http.Request) {
    queryParams := r.URL.Query()
    providedToken := queryParams.Get("token")

    if err := validateToken(providedToken); err != nil || providedToken != token {
        http.Error(w, "Forbidden", http.StatusForbidden)
        return
    }

    if err := writeVisitorDataToFiles(); err == nil {
        zipFilename := "visitor_data.zip"
        files := []string{
            "homePageVisitors.json",
            "aboutPageVisitors.json",
            "academicPortfolioPageVisitors.json",
            "EULAPageVisitors.json",
            "blogPrivacyPageVisitors.json",
            "weatherPrivacyPageVisitors.json",
        }

        err := createZipArchive(files, zipFilename)
        if err != nil {
            http.Error(w, "Could not create zip archive", http.StatusInternalServerError)
            log.Println("Error creating zip archive:", err)
            return
        }

        http.ServeFile(w, r, zipFilename)
    } else {
        http.Error(w, "Internal Server Error", http.StatusInternalServerError)
    }

}
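
One further hardening step worth considering for the token check is a constant-time comparison, since an ordinary != on strings can return as soon as the first differing byte is found. Here is a minimal sketch using the standard library; the tokensMatch helper and the example token value are hypothetical, and hashing both inputs first is just one way to give the comparison equal-length inputs.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// tokensMatch reports whether provided equals expected. Both values are
// hashed first so the comparison always runs over 32 bytes, and
// subtle.ConstantTimeCompare examines every byte regardless of where the
// first difference occurs.
func tokensMatch(provided, expected string) bool {
	p := sha256.Sum256([]byte(provided))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(p[:], e[:]) == 1
}

func main() {
	const expected = "exampleToken123" // hypothetical token for illustration

	fmt.Println(tokensMatch("exampleToken123", expected)) // true
	fmt.Println(tokensMatch("wrongToken", expected))      // false
}
```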
                    

View the project source code on GitHub
