Archive for 十二月, 2005

Overcoming MySQL’s 4GB Limit

星期四, 十二月 29th, 2005

(After having explained this for the 35th time, I decided it’s time to simply put something on-line.)


When a MyISAM table grows large enough, you’ll encounter the dreaded “The table is full” error. Now I could simply point at that page and leave this subject alone, but there’s more to this story.


When this happens, the first reaction I hear is “You never told me that MySQL has a 4GB limit! What am I going to do?” Amusingly, I usually do describe the limit when I discuss the possibility of using MySQL with various groups–they often forget or underestimate the impact it will have. Putting that aside, the problem is easily fixed, as that page explains. You simply need to run an ALTER TABLE command.


And you’ll need to wait. That ALTER TABLE is going to take some time. Really.


To protect yourself in the future, use the MAX_ROWS and AVG_ROW_LENGTH options at CREATE TABLE time if the table is likely to get big.


InnoDB tables to not have this limitation because their storage model is completely different.


Where does this limit come from?


In a MyISAM table with dynamic (variable length) rows, the index file for the table (tablename.MYI) stores row locations using 32-bit pointers into the data file (tablename.MYD). That means it can address only 4GB of space.


This problem is both a historical artifact and an optimization. Back when MySQL was created, it wasn’t common to store that much data in a single table. Heck, for a long time 4GB was an entire hard disk and most operating systems had trouble with files larger than 2GB. Obviously those days are gone. Modern operating systems have no trouble with large files and hard disks larger than 100GB are quite common.


From an optimization point of view, however, the 32-bit pointer still makes sense. Why? Because most people are running MySQL on 32-bit hardware (Intel/Linux). That will change as use of AMD’s Opteron becomes more widespread, but 32-bit will be the majority for the next few years. Using 32-bit pointers is the most efficient way to do this on 32-bit hardware. And even today, most MySQL installations don’t have tables anywhere near 4GB in size. Sure, there are a lot of larger deployments emerging. They’re all relatively new.


An Example


Here’s a table that you might use to store weather data:

mysql> describe weather;
+———–+————–+——+—–+————+——-+
| Field | Type | Null | Key | Default | Extra |
+———–+————–+——+—–+————+——-+
| city | varchar(100) | | MUL | | |
| high_temp | tinyint(4) | | | 0 | |
| low_temp | tinyint(4) | | | 0 | |
| the_date | date | | | 0000-00-00 | |
+———–+————–+——+—–+————+——-+
4 rows in set (0.01 sec)

To find its size limit, we’ll use SHOW TABLE STATUS

mysql> show table status like ‘weather’ \G
*************************** 1. row ***************************
Name: weather
Type: MyISAM
Row_format: Dynamic
Rows: 0
Avg_row_length: 0
Data_length: 0
Max_data_length: 4294967295
Index_length: 1024
Data_free: 0
Auto_increment: NULL
Create_time: 2003-03-03 00:43:43
Update_time: 2003-03-03 00:43:43
Check_time: 2003-06-14 15:11:21
Create_options:
Comment:
1 row in set (0.00 sec)

There it is. Notice that Max_data_length is 4GB. Let’s fix that.

mysql> alter table weather max_rows = 200000000000 avg_row_length = 50;
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> show table status like ‘weather’ \G
*************************** 1. row ***************************
Name: weather
Type: MyISAM
Row_format: Dynamic
Rows: 0
Avg_row_length: 0
Data_length: 0
Max_data_length: 1099511627775
Index_length: 1024
Data_free: 0
Auto_increment: NULL
Create_time: 2003-06-17 13:12:49
Update_time: 2003-06-17 13:12:49
Check_time: NULL
Create_options: max_rows=4294967295 avg_row_length=50
Comment:
1 row in set (0.00 sec)


Excellent. Now MySQL will let us store a lot more data in that table.


Too Many Rows?


Now, the astute reader will notice the Create_options specify a limit of 4.2 billion rows. That’s right, there’s still a limit, but now it’s a limit on number of rows, not the size of the table. Even if you have a table with rows that are 10 times as large, you’re still limited to roughly 4.2 billion rows.


Why?


Again, this is 32-bit hardware. If you move to a 64-bit system, the limit is raised accordingly.


原文:http://jeremy.zawodny.com/blog/archives/000796.html

mysql jdbc在mysql下中文乱码解决

星期二, 十二月 27th, 2005

configure
./configure –prefix=/usr/local/mysql –with-charset=gb2312 –with-extra-charsets=gbk,utf8


my.cnf增加这两行
[mysqld]
default-character-set=gbk


[client]
default-character-set=gbk


jdbc url:
jdbc:mysql://192.168.0.247:3306/bench?useUnicode=true&characterEncoding=GBK

squid反向代理

星期一, 十二月 5th, 2005

只说明几个关键的配置:


打开squid.conf


#squid服务器用80端口作为访问端口
http_port 80
#要代理的http服务器ip/域名
httpd_accel_host localhost
#如果是虚拟主机
httpd_accel_host virtual test.mydomain.com
#代理http服务器端口
httpd_accel_port 80


squid.conf sample:
http_port 80
icp_port 0
acl ALL_QUERY urlpath_regex [.]*
http_access allow ALL_QUERY
cache_mem 16 MB
cache_dir ufs /tmp 256 16 256
log_icp_queries off
buffered_logs on
emulate_httpd_log on
redirect_rewrites_host_header off
half_closed_clients off
acl all src 0.0.0.0/0.0.0.0
http_access allow all
cache_mgr admin
httpd_accel_host www.cruelyouth.com
httpd_accel_port 80
visible_hostname cache1
cache_effective_user squid_user

squid基本安装运行

星期一, 十二月 5th, 2005

1、下载一个可编译的稳定版本,目前版本应为2.5
ftp://www.squid-cache.org/pub/squid-2/STABLE/


2、安装squid-2.5.STABLE5.tar.gz
        % tar xzf squid-2.5.STABLE5.tar.gz
        % cd squid-2.5.STABLE5
        % ./configure –prefix=/usr/local/squid
        % make
        % make install


附configure options


  –prefix=PREFIX         install architecture-independent files in PREFIX
                          [/usr/local/squid]
  –enable-dlmalloc[=LIB] Compile & use the malloc package by Doug Lea
  –enable-gnuregex       Compile GNUregex
  –enable-splaytree      Use SPLAY trees to store ACL lists
  –enable-xmalloc-debug  Do some simple malloc debugging
  –enable-xmalloc-debug-trace
                          Detailed trace of memory allocations
  –enable-xmalloc-statistics
                          Show malloc statistics in status page
  –enable-carp           Enable CARP support
  –enable-async-io       Do ASYNC disk I/O using threads
  –enable-icmp           Enable ICMP pinging
  –enable-delay-pools    Enable delay pools to limit bandwith usage
  –enable-mem-gen-trace  Do trace of memory stuff
  –enable-useragent-log  Enable logging of User-Agent header
  –enable-kill-parent-hack
                          Kill parent on shutdown
  –enable-snmp           Enable SNMP monitoring
  –enable-cachemgr-hostname[=hostname]
                          Make cachemgr.cgi default to this host
  –enable-arp-acl        Enable use of ARP ACL lists (ether address)
  –enable-htpc           Enable HTCP protocol
  –enable-forw-via-db    Enable Forw/Via database
  –enable-cache-digests  Use Cache Digests
                          see http://www.squid-cache.org/Doc/FAQ/FAQ-16.html
  –enable-err-language=lang
                          Select language for Error pages (see errors dir)


3、现在squid安装在/usr/local/squid下
要启动需编辑etc/squid.conf文件
cache_effective_user=squid
cache_effective_group=squid
然后将/usr/local/squid/var/logs目录分配给squid用户
        % chown squid.squid -R /usr/local/squid/var/logs


4、创建swap空间
        % /usr/local/squid/sbin/squid -z

5、检查配置
       % /usr/local/squid/sbin/squid -k parse
如果没有任何提示则表明配置无误。


6、启动
调试模式:
       % /usr/local/squid/sbin/squid -NCd1
服务模式
       % /usr/local/squid/sbin/squid


7、关闭
       % /usr/local/squid/sbin/squid -k shutdown


附录,squid 命令参数


-a

Specify an alternate port number for incoming HTTP requests. Useful for testing a configuration file on a non-standard port.


-d

Debugging level for “stderr” messages. If you use this option, then debugging messages up to the specified level will also be written to stderr.


-f

Specify an alternate squid.conf file instead of the pathname compiled into the executable.


-h

Prints the usage and help message.


-k reconfigure

Sends a HUP signal, which causes Squid to re-read its configuration files.


-k rotate

Sends an USR1 signal, which causes Squid to rotate its log files. Note, if logfile_rotate is set to zero, Squid still closes and re-opens all log files.


-k shutdown

Sends a TERM signal, which causes Squid to wait briefly for current connections to finish and then exit. The amount of time to wait is specified with shutdown_lifetime.


-k interrupt

Sends an INT signal, which causes Squid to shutdown immediately, without waiting for current connections.


-k kill

Sends a KILL signal, which causes the Squid process to exit immediately, without closing any connections or log files. Use this only as a last resort.


-k debug

Sends an USR2 signal, which causes Squid to generate full debugging messages until the next USR2 signal is recieved. Obviously very useful for debugging problems.


-k check

Sends a “ZERO” signal to the Squid process. This simply checks whether or not the process is actually running.


-s

Send debugging (level 0 only) message to syslog.


-u

Specify an alternate port number for ICP messages. Useful for testing a configuration file on a non-standard port.


-v

Prints the Squid version.


-z

Creates disk swap directories. You must use this option when installing Squid for the first time, or when you add or modify the cache_dir configuration.


-D

Do not make initial DNS tests. Normally, Squid looks up some well-known DNS hostnames to ensure that your DNS name resolution service is working properly.


-F

If the swap.state logs are clean, then the cache is rebuilt in the “foreground” before any requests are served. This will decrease the time required to rebuild the cache, but HTTP requests will not be satisified during this time.


-N

Do not automatically become a background daemon process.


-R

Do not set the SO_REUSEADDR option on sockets.


-V

Enable virtual host support for the httpd-accelerator mode. This is identical to writing httpd_accel_host virtual in the config file.


-X

Enable full debugging while parsing the config file.


-Y

Return ICP_OP_MISS_NOFETCH instead of ICP_OP_MISS while the swap.state file is being read. If your cache has mostly child caches which use ICP, this will allow your cache to rebuild faster.



原文参考:http://www.squid-cache.org/Doc/FAQ/FAQ.html

squid: part of Access Control configuration

星期四, 十二月 1st, 2005

ACL elements


Note: The information here is current for version 2.5.


Squid knows about the following types of ACL elements:


  • src: source (client) IP addresses
  • dst: destination (server) IP addresses
  • myip: the local IP address of a client’s connection
  • srcdomain: source (client) domain name
  • dstdomain: destination (server) domain name
  • srcdom_regex: source (client) regular expression pattern matching
  • dstdom_regex: destination (server) regular expression pattern matching
  • time: time of day, and day of week
  • url_regex: URL regular expression pattern matching
  • urlpath_regex: URL-path regular expression pattern matching, leaves out the protocol and hostname
  • port: destination (server) port number
  • myport: local port number that client connected to
  • proto: transfer protocol (http, ftp, etc)
  • method: HTTP request method (get, post, etc)
  • browser: regular expression pattern matching on the request’s user-agent header
  • ident: string matching on the user’s name
  • ident_regex: regular expression pattern matching on the user’s name
  • src_as: source (client) Autonomous System number
  • dst_as: destination (server) Autonomous System number
  • proxy_auth: user authentication via external processes
  • proxy_auth_regex: user authentication via external processes
  • snmp_community: SNMP community string matching
  • maxconn: a limit on the maximum number of connections from a single client IP address
  • req_mime_type: regular expression pattern matching on the request content-type header
  • arp: Ethernet (MAC) address matching
  • rep_mime_type: regular expression pattern matching on the reply (downloaded content) content-type header. This is only usable in the http_reply_access directive, not http_access.
  • external: lookup via external acl helper defined by external_acl_type




Access Lists


There are a number of different access lists:


  • http_access: Allows HTTP clients (browsers) to access the HTTP port. This is the primary access control list.
  • http_reply_access: Allows HTTP clients (browsers) to receive the reply to their request. This further restricts permissions given by http_access, and is primarily intended to be used together with the rep_mime_type acl type for blocking different content types.
  • icp_access: Allows neighbor caches to query your cache with ICP.
  • miss_access: Allows certain clients to forward cache misses through your cache. This further restricts permissions given by http_access, and is primarily intended to be used for enforcing sibling relations by denying siblings from forwarding cache misses through your cache.
  • no_cache: Defines responses that should not be cached.
  • redirector_access: Controls which requests are sent through the redirector pool.
  • ident_lookup_access: Controls which requests need an Ident lookup.
  • always_direct: Controls which requests should always be forwarded directly to origin servers.
  • never_direct: Controls which requests should never be forwarded directly to origin servers.
  • snmp_access: Controls SNMP client access to the cache.
  • broken_posts: Defines requests for which squid appends an extra CRLF after POST message bodies as required by some broken origin servers.
  • cache_peer_access: Controls which requests can be forwarded to a given neighbor (peer).