How to Use cURL With Proxy?
Flipnode on Mar 13 2023
In this comprehensive tutorial, you will learn how to utilize cURL (also known as curl) with proxy servers in a step-by-step manner. Starting from the installation process, we will cover all the aspects of using proxy servers, including various options for setting up the proxy. It is important to note that this guide is not limited to any specific proxy service, as it can be applied to any proxy server by simply providing the server details and credentials.
This tutorial is technical and assumes a basic understanding of proxies. It is particularly useful for those starting out with web scraping, as it focuses on practical, hands-on usage.
What is cURL?
cURL is a powerful command-line tool that lets users send and receive data through URLs. To see how it works, let's start with a basic example. Open your terminal or command prompt and enter the following command:
curl https://www.google.com
Executing this command retrieves the HTML of the webpage and prints it to the console. You can also retrieve only the response headers by adding the -I flag to the command:
curl https://www.google.com -I
This prints the response headers, including the status line (for example, HTTP/1.1 200 OK), the content type, and the character encoding. For more information on using cURL, check out our link to GitHub.
cURL is preinstalled on most Linux distributions and on macOS, and it is now included with Windows 10 as well.
If your Linux distribution doesn't have cURL pre-installed, you can easily install it by running a simple install command. For instance, on Ubuntu, you can open the Terminal and enter the following command:
sudo apt install curl
For additional information on cURL installation, please visit our link to GitHub. In the event that you are utilizing an older version of Windows or desire to install an alternate version, you can download curl from the official download page.
What you need to connect to a proxy
When it comes to using a proxy server, there are a few key pieces of information that you will need to know, regardless of which proxy service you choose to utilize. These include the proxy server address, port, protocol, and any necessary authentication details such as a username and password.
In this tutorial, we will assume that the proxy server address is 127.0.0.1, the port is 1234, the username is "user," and the password is "pwd." It is worth noting that while we will provide examples of various protocols, the steps outlined in this tutorial are not limited to any specific proxy service, meaning that you can apply this knowledge to any proxy server by simply providing the necessary server details and credentials.
If you happen to be connected to a network that uses NTLM authentication, you can use the switch --proxy-ntlm with curl to authenticate your connection. Similarly, --proxy-digest can be used to establish a connection via digest authentication. To discover all of the available options, you can run curl --help.
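As a rough sketch of these two switches, the commands below combine them with the placeholder proxy details used in this tutorial. Passing the credentials through --proxy-user is an assumption made for illustration, and --max-time plus the fallback echo are only there so the sketch exits cleanly when no such proxy is running.

```shell
# Placeholder proxy details from this tutorial; replace with real values.
proxy_host="127.0.0.1:1234"
proxy_creds="user:pwd"

# NTLM authentication against the proxy.
curl --max-time 10 --proxy-ntlm --proxy-user "$proxy_creds" \
  -x "http://$proxy_host" "http://httpbin.org/ip" \
  || echo "NTLM request failed (exit code $?)"

# Digest authentication against the proxy.
curl --max-time 10 --proxy-digest --proxy-user "$proxy_creds" \
  -x "http://$proxy_host" "http://httpbin.org/ip" \
  || echo "Digest request failed (exit code $?)"
```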
Throughout this tutorial, we will provide numerous examples of how to use cURL with proxy servers, with a particular focus on the most common scenario - HTTP and HTTPS proxy usage. Whether you are a novice or an experienced user, this tutorial will offer valuable insights and practical knowledge that will help you better understand how to use cURL with proxy servers effectively.
Using cURL with HTTP/HTTPS proxy
As we discussed earlier, curl is a powerful command line tool that enables you to send and receive data using URLs. One of the most useful applications of curl is its ability to work with proxy servers. Proxy servers act as intermediaries between your computer and the internet, allowing you to browse the web anonymously and access content that may be restricted in your region.
If you want to test out proxies, https://httpbin.org/ip is an excellent website to try. This site allows you to view the output of the origin IP address, making it easy to determine if the proxy is working correctly. When using a proxy, the returned IP address should differ from that of your machine and instead display the proxy's IP address.
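Before involving a proxy at all, it can help to record what the site reports for a direct request, so there is something to compare against later. The following is a minimal sketch; the variable name direct_response is only illustrative, and --max-time is added so the command gives up quickly when the site is unreachable.

```shell
# Record the address httpbin sees for a direct (no proxy) request.
direct_response=$(curl --silent --max-time 10 "https://httpbin.org/ip" || true)

# httpbin answers with a small JSON document containing an "origin" field.
echo "direct request: ${direct_response:-no response}"
```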
There are multiple ways to run curl with a proxy command. One of the most common methods is to send proxy details as a command line argument. This involves specifying several pieces of information, including the proxy server address, port, protocol, username (if authentication is required), and password (if authentication is required). Once you have this information, you can input it into the command line along with the URL you wish to access.
It's essential to keep in mind that all command line options, or switches, are case sensitive. For instance, using the switch -f instructs curl to fail silently, while -F denotes a form to be submitted. By understanding these nuances, you can effectively use curl with proxy and achieve your desired results.
Command line argument to set proxy in cURL
To access the list of options for curl, open your terminal and type "curl --help" followed by the Enter key. The output will display a comprehensive list of curl options, including the "-x" or "--proxy" switch used for proxy details. Note that the switch is case-sensitive and the proxy details can be supplied using either "-x" or "--proxy" switch. The following two commands are equivalent:
curl -x "http://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"
curl --proxy "http://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"
If you run into SSL certificate errors, you can add the -k switch to the curl command, which allows insecure server connections by skipping certificate verification. It is also good practice to surround both the proxy URL and the target URL with double quotes so that special characters in the URLs are handled correctly. Finally, the default protocol for a proxy is HTTP, so the http:// prefix can be omitted from the proxy address without changing the result.
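To illustrate these points together, here is a small sketch that omits the scheme from the proxy address and adds -k for an HTTPS target. It assumes the placeholder proxy from this tutorial; the --max-time limit and the fallback echo are additions so the sketch ends cleanly if that proxy is not reachable.

```shell
# Placeholder proxy from this tutorial. No scheme is given,
# so curl treats it as an HTTP proxy.
proxy="user:pwd@127.0.0.1:1234"

# -k skips certificate verification; use it only for testing.
curl --max-time 10 -k -x "$proxy" "https://httpbin.org/ip" \
  || echo "request failed (exit code $?)"
```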
Using environment variables
Another way to use a proxy with curl is to set the environment variables http_proxy and https_proxy. The export syntax shown here applies to macOS and Linux shells; for Windows users, the next section explains how to use a curlrc file instead. The variable names indicate the protocol of the target URL for which the proxy will be used, not the protocol spoken by the proxy server itself. To use the proxy, set http_proxy to the proxy address for HTTP requests and https_proxy to the proxy address for HTTPS requests. To do this, open the terminal and execute these two commands:
export http_proxy="http://user:pwd@127.0.0.1:1234"
export https_proxy="http://user:pwd@127.0.0.1:1234"
After executing these commands, you can run curl as usual:
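For instance, a plain request with no proxy switch might look like the sketch below. The exports repeat the placeholder values from this tutorial so the snippet is self-contained, and --max-time with the fallback echo are assumptions added so it finishes cleanly without a real proxy.

```shell
# Placeholder proxy from this tutorial, exported for this shell session.
export http_proxy="http://user:pwd@127.0.0.1:1234"
export https_proxy="http://user:pwd@127.0.0.1:1234"

# No -x switch needed: curl picks up the proxy from the environment.
curl --max-time 10 "http://httpbin.org/ip" \
  || echo "request failed (exit code $?)"
```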
If you encounter SSL certificate errors, you can add the -k option to ignore them. Please note that these variables are not limited to curl: any program started from the same shell session that honors them will use the proxy too. If you don't want this behavior, you can unset the two variables using the following commands:
In the next section, we will explain how to set the default proxy only for curl, without affecting other programs.
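A minimal sketch of that cleanup, using the shell's built-in unset; the final echo is only an illustrative check that both variables are gone.

```shell
# Remove both proxy variables from the current shell session.
unset http_proxy
unset https_proxy

# Confirm that neither variable is set any more.
echo "http_proxy=${http_proxy:-<unset>} https_proxy=${https_proxy:-<unset>}"
```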
Configure cURL to always use proxy
To enable a proxy for curl while leaving other programs unaffected, you can create a curl config file. On Linux and macOS, navigate to your home directory in the terminal and open the .curlrc file if it already exists. If not, create a new file using the following commands:
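A minimal sketch of that step, assuming touch is used to create the file (opening it in any text editor would work just as well):

```shell
# Go to the home directory, where curl looks for its per-user config file.
cd ~

# Create .curlrc if it is missing; an existing file keeps its contents.
touch .curlrc
```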
Then add this line to the file:
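The line in question would be a proxy entry such as proxy="http://user:pwd@127.0.0.1:1234". As a sketch, assuming the placeholder proxy details from this tutorial, it can also be appended from the shell instead of an editor:

```shell
# Append a proxy entry to the per-user curl config file.
echo 'proxy="http://user:pwd@127.0.0.1:1234"' >> "$HOME/.curlrc"

# Show the resulting entry.
grep '^proxy' "$HOME/.curlrc"
```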
Save the file, and curl is ready to use the proxy: whenever you run curl normally, it reads the proxy from the .curlrc file automatically. On Windows, the file is named _curlrc and can be placed in the %APPDATA% directory. To find the path of %APPDATA%, open the command prompt and run this command:
echo %APPDATA%
The resulting directory, such as C:\Users\<your_user>\AppData\Roaming, can be used to create the _curlrc file and set the proxy in the same way as on Linux and macOS.
Ignore or override proxy for one request
To modify or bypass the proxy settings for a specific request, the -x or --proxy switch can be used as usual, even if the proxy has been set globally or through the .curlrc file.
To bypass the proxy completely for a single request, pass the --noproxy switch followed by a quoted asterisk ("*"). This instructs cURL not to use a proxy for any host.
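Both cases can be sketched as follows. The second proxy address (port 5678) is a hypothetical value invented for this example, and --max-time with the fallback echo are additions so the sketch ends cleanly when no proxy or network is available.

```shell
# Hypothetical alternative proxy, used only for this one request.
alt_proxy="http://user:pwd@127.0.0.1:5678"

# Override whatever proxy is configured globally or in .curlrc.
curl --max-time 10 --proxy "$alt_proxy" "http://httpbin.org/ip" \
  || echo "override request failed (exit code $?)"

# Bypass any configured proxy. The asterisk is quoted so the shell
# does not expand it as a filename pattern.
curl --max-time 10 --noproxy "*" "http://httpbin.org/ip" \
  || echo "direct request failed (exit code $?)"
```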
For situations where many curl requests need to be executed without changing the system-wide proxy settings, the following section provides a solution.
Bonus tip – turning proxies off and on quickly
For advanced users only, here's a tip to create aliases in the .bashrc file for setting and unsetting proxies. If you're not familiar with this file, feel free to skip this section.
To get started, open your .bashrc file using your preferred editor and add these two lines:
alias proxyon="export http_proxy='http://user:pwd@127.0.0.1:1234';export https_proxy='http://user:pwd@127.0.0.1:1234'"
alias proxyoff="unset http_proxy;unset https_proxy"
Once you've added these lines, save the .bashrc file and update your shell to read the changes. To do this, run the following command in the terminal:
source ~/.bashrc
Now, when you need to turn on the proxy, simply run the "proxyon" command. You can then execute one or more curl commands, and when you're finished, turn off the proxies by running the "proxyoff" command. Here's an example of how to use these aliases:
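In an interactive terminal, the session is simply proxyon, one or more curl commands, then proxyoff. The sketch below shows the same flow in a form that also runs as a script: it swaps the aliases for shell functions with the same names, because bash does not expand aliases in non-interactive shells by default. The proxy details are the placeholders from this tutorial, and --max-time with the fallback echo are additions for a clean exit.

```shell
# Function equivalents of the proxyon/proxyoff aliases.
proxyon() {
  export http_proxy="http://user:pwd@127.0.0.1:1234"
  export https_proxy="http://user:pwd@127.0.0.1:1234"
}
proxyoff() {
  unset http_proxy https_proxy
}

proxyon
echo "proxy in use: $http_proxy"

# Any number of curl commands can go here; they all use the proxy.
curl --max-time 10 "http://httpbin.org/ip" \
  || echo "request failed (exit code $?)"

proxyoff
echo "proxy after proxyoff: ${http_proxy:-none}"
```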
cURL socks proxy
The syntax for setting a proxy server using socks protocol is the same as for other protocols:
curl -x "socks5://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"
You can use socks4://, socks4a://, socks5:// or socks5h:// depending on the socks version.
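As a brief sketch of how the scheme selects the variant, the loop below sends the same request through socks5:// and socks5h://. The two differ in where the target hostname is resolved: locally with socks5://, by the proxy with socks5h://. The proxy details are the placeholders from this tutorial, and --max-time with the fallback echo are additions so the loop ends cleanly without a real proxy.

```shell
# socks5:// resolves the target hostname locally;
# socks5h:// lets the proxy resolve it.
for scheme in socks5 socks5h; do
  curl --max-time 10 -x "$scheme://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip" \
    || echo "$scheme request failed (exit code $?)"
done
```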
Alternatively, you can use the --socks5 switch to set a curl socks proxy. To send the username and password, use the --proxy-user switch. Here's an example:
curl --socks5 "127.0.0.1:1234" "http://httpbin.org/ip" --proxy-user user:pwd
You can use --socks4, --socks4a or --socks5 depending on the version. Note that this tip is more suitable for advanced users.
cURL is a highly effective automation tool with extensive proxy support through its command-line interface. Moreover, because libcurl is available in PHP, many web applications use it for their web scraping projects, which makes it a valuable tool for any web scraper.