Documentation

Directory Structure

Script Requirements

All of the following scripts must be present in the same directory (script_dir):

class_fit2.py

class_hb.py

class_titr.py

create_titr.py

fitting2.py

fitting_models.py

make_figinstr2.py

misc_functions.py

program_scripts.py

settings.py

figures2.py

To run scripts, you must execute them by typing: "python script_dir/script.py"
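Before starting an analysis, it can help to confirm that script_dir is complete. The following is a minimal sketch, not part of the distributed scripts: the function name missing_scripts is hypothetical, and the list of file names is copied from the list above.

```python
import os

# Script names copied from the list of required scripts above.
REQUIRED_SCRIPTS = [
    "class_fit2.py", "class_hb.py", "class_titr.py", "create_titr.py",
    "fitting2.py", "fitting_models.py", "make_figinstr2.py",
    "misc_functions.py", "program_scripts.py", "settings.py", "figures2.py",
]


def missing_scripts(script_dir):
    """Return the required script names that are not files in script_dir."""
    return [name for name in REQUIRED_SCRIPTS
            if not os.path.isfile(os.path.join(script_dir, name))]
```

An empty list returned by missing_scripts("script_dir") means every required script was found.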

General Flow

Directories:

Files:

1. Create a directory where the analysis will take place (/analysis)

2. Prepare peak list files.

· Sparky Peak List Files: For each spectrum in Sparky, create a peak list file (use the lt command) and save it in the analysis directory. Files must have a ‘.list’ extension.

· Sparky Save Files: Using the Harms’ sparky_extract-peaks.py program, generate a peak summary file with extension ‘.mjh’.

· Manually created files: Create your own titration file. File must be tab-delimited and the first row of the file must contain headers. The file name must end with ‘CS-pH.txt’.
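The naming rules above can be checked with a short script before running create_titr.py. This is only a sketch under stated assumptions: the function names and category labels are hypothetical, and it does not reproduce how create_titr.py itself reads these files. The second function shows one way to read a manually created, tab-delimited file whose first row holds the headers.

```python
import csv
import os


def classify_input_file(filename):
    """Classify a file name by the naming rules in step 2, or return None."""
    if filename.endswith(".list"):
        return "sparky_peak_list"
    if filename.endswith(".mjh"):
        return "sparky_peak_summary"
    if filename.endswith("CS-pH.txt"):
        return "manual_titration"
    return None


def read_manual_titration(path):
    """Read a tab-delimited titration file whose first row holds the headers.

    Returns (headers, rows), where each row is a dict keyed by header name.
    Values are returned as strings; no column names are assumed.
    """
    if classify_input_file(os.path.basename(path)) != "manual_titration":
        raise ValueError("file name must end with 'CS-pH.txt': %s" % path)
    with open(path, "r") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        headers = reader.fieldnames  # reading this consumes the header row
        rows = list(reader)
    return headers, rows
```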

3. Create a titration.log file in the analysis directory. This file will store the information about the titration – protein, salt condition, temperature, magnet, and the name of the titration experiment. The name of the experiment is the most critical, as it will be used by downstream programs for plotting purposes.

4. In the analysis directory, run create_titr.py. This script will generate.

# Fit control file example

# 20060515

fit_type: one_site_mHb, fit_program: R, atom_type: CG, res_type: ASP, which: all

ASP_19, not-fitable

ASP_21, not-fitable

ASP_40, CS_AH1: 175.37, CS_A_: 179.10, pKa1: 3.85, n1: 0.57

ASP_77, not-fitable

ASP_83, not-fitable

ASP_95, not-fitable

ASP_143, CS_AH1: 176.72, CS_A_: 180.08, pKa1: 3.78, n1: 0.76

ASP_146, CS_AH1: 176.84, CS_A_: 180.37, pKa1: 3.85, n1: 0.76

end

fit_type: two_site_b, fit_program: R, atom_type: CG, res_type: ASP, which: select

ASP_21, CS_AH2: 176.60, CS_AH1: 176.20, CS_A_: 179.87, pKa1: 3.01, pKa2: 6.54

end

If a particular resonance cannot be fitted (e.g. the Asp-77 chemical shift exhibits no pH dependence), then the line should read ‘ASP_77, not-fitable’.

The format of a fit instruction line should be:
[residue type]_[residue number], [parameter_name]: [value], [parameter_name]: [value]
example:

GLU_43, CS_AH1: 179.69, CS_A_: 184.07, pKa1: 4.32, n1: 0.70

ASP_95, fixed~CS_AH1: 178.01, CS_A_: 181.66, pKa1: 2.0, n1: 0.8

Lines that begin with ‘#’ are not read by fitting2.py. The file is divided into sections that begin with ‘fit_type’ and end with ‘end’. Each section uses a different fit model and fits different residue and atom types. There are five essential parameters on the fit_type line: fit_type, fit_program, atom_type, res_type, and which.

To fix a parameter, please add “fixed~” to the front of the parameter.
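To make the line format concrete, here is a minimal sketch of how a single residue line could be parsed. It is an illustration only and is not the parser used by fitting2.py; the function name parse_fit_line and its return layout are hypothetical.

```python
def parse_fit_line(line):
    """Parse one residue line of a fit control file.

    Returns (res_type, res_number, guesses, fixed). guesses maps parameter
    names to floats and fixed is the set of parameters marked 'fixed~'.
    guesses is None for a 'not-fitable' residue.
    """
    fields = [field.strip() for field in line.split(",")]
    res_type, res_number = fields[0].split("_")
    if fields[1:] == ["not-fitable"]:
        return res_type, int(res_number), None, set()
    guesses, fixed = {}, set()
    for field in fields[1:]:
        name, value = field.split(":")
        name = name.strip()
        if name.startswith("fixed~"):
            # Strip the marker and remember that this parameter is held fixed.
            name = name[len("fixed~"):]
            fixed.add(name)
        guesses[name] = float(value)
    return res_type, int(res_number), guesses, fixed
```

For the line "ASP_95, fixed~CS_AH1: 178.01, CS_A_: 181.66, pKa1: 2.0, n1: 0.8", this sketch reports CS_AH1 as the only fixed parameter and keeps all four values as initial guesses.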

6. Run fitting2.py in the analysis directory. This script uses the gnls function in R. This script will read the .fit files and create

7. Create a subdirectory called ‘plots’.

8. Copy (or symbolically link) the titr_fit.pickle file (if already fitted) or the titr.pickle file to the plots directory.
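Steps 7 and 8 can be scripted. The sketch below assumes that the pickle files are located in the analysis directory and that ‘plots’ is created inside it; the function name prepare_plots_dir is hypothetical. It copies the file rather than linking it, which also works on systems without symbolic links.

```python
import os
import shutil


def prepare_plots_dir(analysis_dir):
    """Create analysis_dir/plots and copy the pickle file into it.

    titr_fit.pickle is preferred (fits already done); titr.pickle is the
    fallback. Returns the path of the copied file.
    """
    plots_dir = os.path.join(analysis_dir, "plots")
    if not os.path.isdir(plots_dir):
        os.makedirs(plots_dir)
    for name in ("titr_fit.pickle", "titr.pickle"):
        source = os.path.join(analysis_dir, name)
        if os.path.isfile(source):
            target = os.path.join(plots_dir, name)
            shutil.copy2(source, target)
            return target
    raise IOError("no titr_fit.pickle or titr.pickle in %s" % analysis_dir)
```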

9. Within the plots directory, create a figure instruction file by running make_figinstr2.py. This script will create a file called instructions.txt with plot parameters. All of the fits performed will be listed in this file.

10. Edit the instructions.txt file. Here is an example file:

rows: 5, columns: 4, page_margins: 0.5_0.5_0.5_0.5

x_range: 1_9, y_amplitude: 4.5, plot_margins: 0.15_0.15_0.15_0.15, show_pKa: y

name: DPPHS_25C_0p10M-KCl_2

figure_1

ASP_19_CG, DPPHS_25C_0p10M-KCl_2: two_site_b--fixed~CS_AH2:177.43

figure_2

ASP_21_CG, DPPHS_25C_0p10M-KCl_2: two_site_b

Instructions for each data set to be plotted are written as follows:
[descriptor], [titration_name]: [fit_name]

GLU_101_CD, DPPHS_25C_0p10M-KCl_2: one_site_mHb

The descriptor contains three fields: residue_type, residue_number, and atom_type. These fields are combined into one string using ‘_’.

If multiple fits are to be plotted for the same data set, the fit names must be separated by a comma:

one_site_mHb, one_site_mHb--fixed~CS_AH1:179.18

If multiple data sets are to be plotted on the same figure, they only need to be defined on separate lines:

figure_6
ASP_95_CG, DPPHS_25C_0p10M-KCl_3: one_site_mHb--fixed~CS_AH1:178.02
ASP_95_CG, DPPHS_25C_1p00M-KCl_1: one_site_mHb
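The data set line format described above can be illustrated with a short parser. This is a sketch only, not the code used by figures2.py: the function name parse_plot_line and the dictionary keys are hypothetical, and it assumes that individual fit names never contain a comma.

```python
def parse_plot_line(line):
    """Parse '[descriptor], [titration_name]: [fit_name][, fit_name ...]'."""
    descriptor, rest = line.split(",", 1)
    res_type, res_number, atom_type = descriptor.strip().split("_")
    # Split on the first ':' only, because fit names such as
    # 'one_site_mHb--fixed~CS_AH1:179.18' contain their own ':'.
    titration_name, fit_part = rest.split(":", 1)
    fit_names = [name.strip() for name in fit_part.split(",")]
    return {
        "res_type": res_type,
        "res_number": int(res_number),
        "atom_type": atom_type,
        "titration": titration_name.strip(),
        "fits": fit_names,
    }
```

Lines such as figure_6 carry no comma and would need to be handled separately as figure headers before calling this function.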

11. Edit settings.py to incorporate the plot settings for that titration data set:

'DPPHS_25C_0p10M-KCl_1' : {'color': 'black', 'pch': 21, 'bg': 'red'}

12. Run figures2.py by typing "python script_dir/figures2.py instructions.txt" in the plots directory. This will create an instructions_.ps file, which you can open in Ghostscript/Ghostview (on Windows), Preview (on Macs), or display (on Unix).
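The settings.py entry for step 11 is keyed by the titration name, and its values look like R plotting parameters (in R, pch 21 is a circle whose fill is set by bg). How settings.py stores and looks up these entries is not described here, so the sketch below is an assumption: the names PLOT_SETTINGS, DEFAULT_STYLE, and style_for are all hypothetical.

```python
# Hypothetical variable name; the actual name used in settings.py is not
# given in this document.
PLOT_SETTINGS = {
    'DPPHS_25C_0p10M-KCl_1': {'color': 'black', 'pch': 21, 'bg': 'red'},
}

# Assumed fallback for titrations that have no entry of their own.
DEFAULT_STYLE = {'color': 'black', 'pch': 21, 'bg': 'white'}


def style_for(titration_name):
    """Return the plot style for a titration, or the assumed default."""
    return PLOT_SETTINGS.get(titration_name, DEFAULT_STYLE)
```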

NMR Data fitting tool

analysis	titration peak lists analyzed for pKa determination and plots of chemical shift data vs. pH.
postscripts	postscript files of spectra (Sparky’s printing service was used for these)
sparky_save	Sparky .save files for each spectrum recorded.
data_files
fit_files
spectra	Raw NMR data for each experiment.

Raw Spectrometer FID data
fid	Raw Varian binary data or raw Bruker binary data for 1D experiments
ser	Raw Bruker binary data for multi-dimensional experiments
Acquisition parameter files
procpar	Varian acquisition parameter files
acqu, acqus, acqu2, acqu2s, etc.	Bruker acquisition files for each dimension before and after data acquisition.
NMRPipe processing scripts
fid.com	convert from spectrometer data to NMRPipe format
ft2.com	convert NMRPipe FID data into FT-processed data. This script is commonly used for processing HSQC experiments.
ipap_ft2.com	convert NMRPipe ipap_proc FID data into FT-processed data. This script is commonly used for processing CBCGCO IPAP experiments.
Spectra for analysis
*.ucsf	Spectral files for analysis in Sparky. All NMRPipe spectra were converted to ucsf format using the Sparky pipe2ucsf script.
*.save

Software	Version	Website
Python	2.6	http://www.python.org/download/
R	>= 2.10.0	http://www.r-project.org/
RPy2	>= 2.1.0	http://rpy.sourceforge.net/
nlme	any	http://cran.r-project.org/web/packages/nlme/index.html
Pywin32 extension (Win)	any	http://sourceforge.net/projects/pywin32/

fit_type	one_site_mHb, one_site or two_site_b (these are defined in fitting_models.py)
fit_program	R
atom_type	HN, N, CO, CA, CB, CG, CD, HB2, HB3, etc.
res_type	ASP, GLU, HIS, etc.
which	select or all (‘select’ means that only those resonances listed below the fit_type line will be fitted, while ‘all’ means that all resonances with the same residue type and atom type will be fitted). If fitting parameters are not listed for some resonances, default initial guesses will be used. Initial guesses are listed in the fitting_models.py file.
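For orientation, the sketch below shows what a one-site model with a Hill coefficient typically looks like. It assumes that one_site_mHb denotes the commonly used modified Hill equation and reuses the parameter names from the fit control file (CS_AH1, CS_A_, pKa1, n1). The authoritative definitions are those in fitting_models.py, which may differ; the function name is hypothetical.

```python
def one_site_modified_hill(pH, CS_AH1, CS_A_, pKa1, n1):
    """Chemical shift at a given pH for one titration event.

    CS_AH1 is the shift of the protonated state, CS_A_ the shift of the
    deprotonated state, pKa1 the midpoint, and n1 the Hill coefficient.
    Assumed form: modified Hill equation (see fitting_models.py for the
    model actually fitted).
    """
    ratio = 10.0 ** (n1 * (pH - pKa1))  # deprotonated/protonated weighting
    return (CS_AH1 + CS_A_ * ratio) / (1.0 + ratio)
```

With this form, the curve approaches CS_AH1 at low pH and CS_A_ at high pH, and it passes through the average of the two at pH = pKa1.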

titr_fit_summary.txt	A tab-delimited text file listing initial guesses, fit name, final fit parameters and standard errors
fit_files directory	Directory that contains fit summary files for each resonance
Python pickle	A text file that stores all of the titration data and fit data into one file read by subsequent tables.py and figures2.py scripts

Plot Type	make_figinstr2.py input parameters
D, E, COOH using Cg/Cd	COOH
D, E, COOH using Cb/Cg	COOH CBCG
D, E, COOH using Hb2/Hg2	COOH H2
D, E, COOH using Hb3/Hg3	COOH H3
H^N	HN

Parameter	Description
rows and columns	specifies the layout of how the figures should be plotted on a page
page_margins	specifies the outer margins outside the figure plotting area (margins are listed in the order left, bottom, right, top, and separated by ‘_’)
x_range	example value: 1_9
y_amplitude	example value: 4.5
plot_margins	example value: 0.15_0.15_0.15_0.15
show_pKa	specifies whether the pKa value from a fit will be displayed or not
name	defines the title of the page