Data Validation

Untrusted data can reach your site in many ways (users, other sites, your own database, ...), so all of it must be validated both when it is received and when it is shown to the user.

Output Sanitation

How data is sanitized depends on the type of data and the context in which it is used. Below are some of the most common tasks in WordPress and how each should be sanitized.

Tip: When sanitizing data for output to the browser, do it as late as possible in the code, ideally at the moment the data is written out.

Integers

intval( $int ) or (int) $int
If the value is supposed to be an integer, cast it as one.
absint( $int )
Ensures that the result is a nonnegative integer.
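A minimal sketch of these casts in plain PHP. absint() is a WordPress function, so the abs( (int) ... ) expression below is a stand-in with the same effect for code running outside WordPress; the sample inputs are made up for illustration.

```php
<?php
// Untrusted values typically arrive as strings (for example from $_GET).
$page  = '42abc';
$delta = '-7';

// intval() and the (int) cast keep the leading digits and drop the rest.
$page_int  = intval( $page );     // 42
$delta_int = (int) $delta;        // -7

// Stand-in for absint(): cast first, then take the absolute value.
$delta_abs = abs( (int) $delta ); // 7

var_dump( $page_int, $delta_int, $delta_abs );
```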

HTML/XML

Note that many types of XML documents (as opposed to HTML documents) understand only a few named character references: apos, amp, gt, lt, quot. When outputting text to such an XML document, be sure to filter any text that may contain illegal named entities through the WordPress function:

ent2ncr( $text )

HTML/XML Fragments

wp_kses( (string) $fragment, (array) $allowed_html, (array) $protocols = null )
All untrusted HTML (post content, comment text, etc.) should be run through wp_kses().
To avoid having to pass an array of allowed HTML tags, you can use wp_kses_post( (string) $fragment ) for the tags allowed in posts and pages, or wp_kses_data( (string) $fragment ) for the list of tags allowed in comments.
wp_rel_nofollow( (string) $html )
Adds a "rel='nofollow'" attribute to any <a> tag.

Text Nodes

esc_html( $text ) (since 2.8)
Encodes < > & " ' (less than, greater than, ampersand, double quote, single quote). Very similar to esc_attr.
esc_html__ (since 2.8)
Translates and encodes
esc_html_e (since 2.8)
Translates, encodes, and echoes
esc_textarea (since 3.1)
Encodes text for use inside a textarea element.
wp_specialchars( $string, $quote_style = ENT_NOQUOTES, $charset = false, $double_encode = false ) (deprecated since 2.8)
Encodes < > & (less than, greater than, ampersand). Will never double encode entities. Since 2.8, when called with only one argument it also encodes quotes (using esc_html), as a safeguard for older plugins.
htmlspecialchars( $text, ENT_NOQUOTES )
Encodes < > &. Will double encode HTML entities if run more than once on the same text.
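The double-encoding behaviour of htmlspecialchars() can be seen in a short plain-PHP sketch. The sample string is made up; the fourth argument used at the end is PHP's own $double_encode parameter.

```php
<?php
$text = 'Fish & <Chips>';

// First pass: < > & become entities; quotes are left alone (ENT_NOQUOTES).
$once = htmlspecialchars( $text, ENT_NOQUOTES, 'UTF-8' );

// Second pass: the & that starts each entity is encoded again.
$twice = htmlspecialchars( $once, ENT_NOQUOTES, 'UTF-8' );

// With $double_encode set to false, existing entities are left untouched.
$safe_twice = htmlspecialchars( $once, ENT_NOQUOTES, 'UTF-8', false );

echo $once, "\n";       // Fish &amp; &lt;Chips&gt;
echo $twice, "\n";      // Fish &amp;amp; &amp;lt;Chips&amp;gt;
echo $safe_twice, "\n"; // Fish &amp; &lt;Chips&gt;
```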

Attribute Nodes

esc_attr( $text ) (since 2.8)
attribute_escape( $text ) (deprecated since 2.8)
Encodes < > & " ' (less than, greater than, ampersand, double quote, single quote). Will never double encode entities. See esc_url() in #URLs
esc_attr__()
Translates and encodes
esc_attr_e()
Translates, encodes, and echoes
htmlspecialchars( $text, ENT_QUOTES )
Encodes < > & " '. Will double encode html entities if run twice. See esc_url() in #URLs
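As a rough illustration of why quotes matter in attribute context, here is a plain-PHP sketch using htmlspecialchars() with ENT_QUOTES. The sample title is invented; inside WordPress, esc_attr() from the list above is the function to reach for.

```php
<?php
// An untrusted value that contains both kinds of quotes.
$title = 'He said "hi" & \'bye\'';

// ENT_QUOTES encodes double and single quotes as well as < > &,
// so the value cannot close the attribute early.
$attr = htmlspecialchars( $title, ENT_QUOTES, 'UTF-8' );

echo '<a href="#" title="' . $attr . '">link</a>', "\n";
```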

JavaScript

esc_js( $text ) (since 2.8)
js_escape( $text ) (deprecated since 2.8)
Escapes ', encodes ", and fixes line endings.

URLs

esc_url( $url, (array) $protocols = null ) (since 2.8)
Always use esc_url when sanitizing URLs (in text nodes, attribute nodes or anywhere else). Rejects URLs that do not have one of the provided whitelisted protocols (defaulting to http, https, ftp, ftps, mailto, news, irc, gopher, nntp, feed, and telnet), eliminates invalid characters, and removes dangerous characters. Replaces clean_url() which was deprecated in 3.0.
This function encodes characters as HTML entities: use it when generating an (X)HTML or XML document. Encodes ampersands (&) and single quotes (') as numeric entity references (&#038;, &#039;).
esc_url_raw( $url, (array) $protocols = null ) (since 2.8)
For inserting a URL in the database. This function does not encode characters as HTML entities: use it when storing a URL or in other cases where you need the non-encoded URL. This functionality can be replicated in the old clean_url function by setting $context to db.
urlencode( $scalar )
Encodes for use in URL (as a query parameter, for example)
urlencode_deep( $array )
urlencodes all array elements.
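A small plain-PHP sketch of urlencode() on a query parameter. The example.com URL and the values are placeholders. The array_map() call only covers a flat array; urlencode_deep() is the WordPress function listed above for arrays.

```php
<?php
// A search term containing characters that are special in a query string.
$search = 'rock & roll?';

// Encode the value before appending it to the URL.
$url = 'http://example.com/?s=' . urlencode( $search );
echo $url, "\n"; // http://example.com/?s=rock+%26+roll%3F

// Encoding every element of a flat array.
$encoded = array_map( 'urlencode', array( 'a b', 'c/d' ) );
print_r( $encoded ); // a+b, c%2Fd
```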

Database

$wpdb->insert( $table, (array) $data )
$data should be unescaped (the function will escape them for you). Keys are columns, Values are values.
$wpdb->update( $table, (array) $data, (array) $where )
$data should be unescaped. Keys are columns, Values are values. $where should be unescaped. Multiple WHERE conditions are ANDed together.
$wpdb->update(
  'my_table',
  array( 'status' => $untrusted_status, 'title' => $untrusted_title ),
  array( 'id' => 123 )
);
$wpdb->prepare( $format, (scalar) $value1, (scalar) $value2, ... )
$format is a sprintf() like format string. It only understands %s and %d, neither of which needs to be enclosed in quotation marks.
$wpdb->get_var( $wpdb->prepare(
  "SELECT something FROM table WHERE foo = %s and status = %d",
  $name, // an unescaped string (function will do the sanitation for you)
  $status // an untrusted integer (function will do the sanitation for you)
) );
esc_sql( $sql ) (since 2.8)
$wpdb->escape( $text )
Escapes a single string for use in a SQL query. Glorified addslashes().
$wpdb->escape_by_ref( &$text )
No return value.
like_escape( $string )
Sanitizes $string for use in a LIKE expression of a SQL query. Will still need to be SQL escaped (with one of the above functions).

Filesystem

validate_file( (string) $filename, (array) $allowed_files = "" )
Used to prevent directory traversal attacks, or to test a filename against a whitelist. Returns 0 if $filename represents a valid relative path. After validating, you must treat $filename as a relative path (i.e. you must prepend it with an absolute path), since something like /etc/hosts will validate with this function. Returns an integer greater than zero if the given path contains .., ./, or :, or is not in the $allowed_files whitelist. Be careful making boolean interpretations of the result, since false (0) indicates the filename has passed validation, whereas true (> 0) indicates failure.
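The return-value caveat can be made concrete with a small wrapper. This is only a sketch: example_resolve_upload() and the directory passed to it are hypothetical, and the code assumes WordPress is loaded so that validate_file() is available.

```php
<?php
/**
 * Hypothetical helper: resolve an untrusted relative filename inside $base_dir.
 * Returns the full path, or false when validate_file() rejects the name.
 * Assumes WordPress is loaded so that validate_file() exists.
 */
function example_resolve_upload( $filename, $base_dir ) {
    // 0 means "passed", so compare explicitly instead of treating the result as a boolean.
    if ( 0 !== validate_file( $filename ) ) {
        return false;
    }

    // The validated name is still only a relative path: anchor it to a known directory.
    return rtrim( $base_dir, '/' ) . '/' . ltrim( $filename, '/' );
}
```

Comparing against 0 with !== keeps the "zero means success" convention from being inverted by accident.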

HTTP Headers

Header splitting attacks are annoying since they are dependent on the HTTP client. WordPress has little need to include user generated content in HTTP headers, but when it does, WordPress typically uses whitelisting for most of its HTTP headers.

WordPress does use user generated content in HTTP Location headers, and provides sanitation for those.

wp_redirect($location, $status = 302)
A safe way to redirect to any URL. Ensures the resulting HTTP Location header is legitimate.
wp_safe_redirect($location, $status = 302)
Even safer. Only allows redirects to whitelisted domains.

Input Validation

Many of the functions above in #Output_Sanitation are useful for input validation. In addition, WordPress uses the following functions.

Slugs

sanitize_title( $title )
Used in post slugs, for example
sanitize_user( $username, $strict = false )
Use $strict when creating a new user (though you should use the API for that).

HTML

balanceTags( $html ) or force_balance_tags( $html )
Tries to make sure HTML tags are balanced so that valid XML is output.
tag_escape( $html_tag_name )
Sanitizes an HTML tag name (does not escape anything, despite the name of the function).
sanitize_html_class( $class, $fallback )
Sanitizes an HTML class name to ensure it only contains valid characters. Strips the string down to A-Z, a-z, 0-9, '_' and '-'. If this results in an empty string, the fallback value supplied is returned instead.

Email

is_email( $email_address )
Returns boolean false if invalid, or $email_address if valid.

Arrays

array_map( 'absint', $array )
Ensures all elements are nonnegative integers. Replace callback with whatever is appropriate for your data.
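A plain-PHP sketch of the same pattern. Since absint() only exists inside WordPress, the closure below is a stand-in that casts and then takes the absolute value; the sample array is invented.

```php
<?php
$ids = array( '3', '-15', '7 apples', 'none' );

// Cast every element to an integer.
$ints = array_map( 'intval', $ids );

// Stand-in for the 'absint' callback: cast, then take the absolute value.
$nonnegative = array_map(
    function ( $value ) {
        return abs( (int) $value );
    },
    $ids
);

print_r( $ints );        // 3, -15, 7, 0
print_r( $nonnegative ); // 3, 15, 7, 0
```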

Validation Philosophies

There are several different philosophies about how validation should be done. Each is appropriate for different scenarios.

Whitelist

Accept data only from a finite list of known and trusted values.

$possible_values = array( 'a', 1, 'good' );
if ( !in_array( $untrusted, $possible_values ) )
  die( "Don't do that!" );
// Be careful here with fancy breaks and default actions.
switch ( $untrusted ) {
case 'a' :
  ...
  break;
...
default :
  die( "You hoser!" );
}
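One detail worth keeping in mind with the in_array() check above: by default it compares loosely, so a string can match an integer in the list. A short sketch of the difference, using the same list of values:

```php
<?php
$possible_values = array( 'a', 1, 'good' );

// Loose comparison (the default): the string '1' matches the integer 1.
$loose = in_array( '1', $possible_values );

// Strict comparison: value and type must both match, so '1' is rejected.
$strict = in_array( '1', $possible_values, true );

var_dump( $loose );  // bool(true)
var_dump( $strict ); // bool(false)
```

Passing true as the third argument is usually the safer choice when the list mixes strings and integers.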

Blacklist

Reject data from a finite list of known untrusted values. This is very rarely a good idea.

Format Detection

Test to see if the data is of the correct format. Only accept it if it is.

if ( !ctype_alnum( $data ) )
  die( "Your data is teh suX0R" );
if ( preg_match( "/[^0-9.-]/", $data ) )
  die( "Float on somewhere else, jerky" );
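The character-class test above only rejects characters that can never appear in a float, so a string such as 1.2.3 still passes. A hedged sketch of a stricter check using PHP's built-in filter_var() with FILTER_VALIDATE_FLOAT; the sample strings are illustrative.

```php
<?php
$samples = array( '-3.14', '1.2.3', '12abc' );

foreach ( $samples as $data ) {
    // 1 when a character outside 0-9 . - is present, 0 otherwise.
    $has_bad_char = preg_match( '/[^0-9.-]/', $data );

    // The parsed float on success, false when the string is not a float.
    $as_float = filter_var( $data, FILTER_VALIDATE_FLOAT );

    printf(
        "%-6s character check: %s, filter_var: %s\n",
        $data,
        $has_bad_char ? 'rejected' : 'passed',
        false === $as_float ? 'rejected' : 'accepted'
    );
}
```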

Format Correction

Accept most any data, but remove or alter the dangerous pieces.

$trusted_integer = (int) $untrusted_integer;
$trusted_alpha = preg_replace( '/[^a-z]/i', "", $untrusted_alpha );
$trusted_slug = sanitize_title( $untrusted_slug );

Changelog

  • 3.1: Introduced esc_textarea. (#15454)
  • 3.0: Deprecated clean_url() in favor of esc_url() and esc_url_raw(). (#12309)
  • 2.8: Deprecated the following functions. (via WordPress Development Updates)
    • clean_url() -> esc_url()
    • sanitize_url() -> esc_url_raw()
    • wp_specialchars() -> esc_html() (also: esc_html__() and esc_html_e())
    • attribute_escape() -> esc_attr() (also: esc_attr__() and esc_attr_e())


See also wp_specialchars() vs attribute_escape() (now esc_attr()).